DOCUMENT RESUME

ED 135 616                                        SE 021 985

AUTHOR        Ruud, Orville George
TITLE         The Construction of an Instrument to Measure
              Proportional Reasoning Ability of Junior High
              Pupils.
PUB DATE      Dec 76
NOTE          280p.; Ph.D. Dissertation, University of Minnesota;
              Not available in hard copy due to marginal legibility
              of original document
EDRS PRICE    MF-$0.83 Plus Postage. HC Not Available from EDRS.
DESCRIPTORS   *Cognitive Development; Developmental Tasks; Doctoral
              Theses; *Educational Research; Learning Theories;
              Measurement Instruments; *Physical Sciences; Science
              Education; Secondary Education; *Secondary School
              Science; *Tests
IDENTIFIERS   *Piaget (Jean); Research Reports



ABSTRACT

The purpose of this study was to develop a paper-pencil test of
Piagetian levels of proportional thinking for junior high school
students in the context of physical science. Two thousand twenty-seven
students were tested to develop the instrument and the description of
its characteristics. The final form consisted of 24 items with four
subtests each of six items for Piagetian levels: Concrete Operational
I, Concrete Operational II, Formal Operational I, and Formal
Operational II. Piagetian task interviews were also given to a group
of students, and the paper-pencil test results correlated positively
with the task results of the students who took both tests. Content,
concurrent, construct, divergent, and convergent validity measurements
showed the paper-pencil test to be valid. The test was also shown to
have a high reliability and good item discrimination between
proportional reasoning levels. (HH)



Documents acquired by ERIC include many informal unpublished
materials not available from other sources. ERIC makes every effort
to obtain the best copy available. Nevertheless, items of marginal
reproducibility are often encountered and this affects the quality
of the microfiche and hardcopy reproductions ERIC makes available
via the ERIC Document Reproduction Service (EDRS). EDRS is not
responsible for the quality of the original document. Reproductions
supplied by EDRS are the best that can be made from the original.






U.S. DEPARTMENT OF HEALTH,
EDUCATION & WELFARE
NATIONAL INSTITUTE OF
EDUCATION

THIS DOCUMENT HAS BEEN REPRODUCED EXACTLY AS RECEIVED FROM
THE PERSON OR ORGANIZATION ORIGINATING IT. POINTS OF VIEW OR
OPINIONS STATED DO NOT NECESSARILY REPRESENT OFFICIAL NATIONAL
INSTITUTE OF EDUCATION POSITION OR POLICY.



THE CONSTRUCTION OF AN INSTRUMENT TO MEASURE PROPORTIONAL
REASONING ABILITY OF JUNIOR HIGH PUPILS



A Thesis 

Submitted to the Faculty of the Graduate School 
of the University of Minnesota 



Orville George Ruud



In Partial Fulfillment of the Requirements 
for the Degree of 
Doctor of Philosophy 



December, 1976 



CONTENTS

LIST OF TABLES

LIST OF FIGURES

CHAPTER

1  THE PROBLEM
     Introduction
     Statement of the Problem
     Hypothesis and Task of Study
     Definitions
     Basic Design
     Phase I - Pilot Study
     Phase II - Task Interview Testing
     Phase III - Paper-Pencil Testing

2  SURVEY OF RELATED RESEARCH LITERATURE
     Studies of Formal Operations
     Original Studies
     Replications of Original Studies
     Related Studies
     Batteries of Tasks
     Correlational Studies
     Developmental Studies
     Studies of Proportional Thinking
     Original Studies
     Replications of Original Studies
     Studies of Components of Proportional Reasoning
     Learning Theory Implications of Some Studies
     Studies Using Group and Paper-Pencil Tests
     Studies and Precepts of Criterion-Referenced Testing
     Original Studies
     Test Design
     Task Testing Concerns
     Item Collections and Scoring
     Written Tasks
     Studies Employing Criterion-Referenced Testing
     Analysis Techniques of Validity and Reliability

3  PHASE I - THE PILOT STUDY
     Setting
     School Site
     Pupils
     Basic Design
     Initial Study
     Task Interviews
     Paper-Pencil Tests
     Pilot Study Results
     Task Interviews
     Paper-Pencil Tests
     Implications for Phase II

4  PHASE II - TASK INTERVIEW TESTING
     Setting
     Sample Selection
     Basic Design
     Phase II Results
     Implications for Phase III

5  PHASE III - PAPER-PENCIL TESTING
     Test Versions and Sample Selection
     Basic Design
     Phase III Results/Interpretations
     Version I
     Version II
     Version III A and Version III B
     Version IV A
     Version IV B
     Version V A
     Version V B
     Summary

6  CHARACTERISTICS OF THE INSTRUMENT
     Validity
     Content Validity
     Concurrent Validity
     Construct Validity
     Discriminant Validity
     Convergent Validity
     Summary of Validity
     Reliability
     Summary of Reliability
     Item Difficulty
     Item Discrimination
     Summary

7  CONCLUSIONS
     Review of Purpose and Procedure
     Findings
     Educational Implications
     Limitations of the Study and Suggestions for Further Research

SELECTED BIBLIOGRAPHY






LIST OF TABLES

TABLE

3.1   Task Interview Criteria
3.2   Sample Pupil Responses
3.3   Pupil Average Scores on Pilot Tasks
3.4   Rating of Pilot Task Performance
3.5   Pilot Paper-Pencil Average Scores
3.6   Average Scores of Paper-Pencil Problems
3.7   Contingency Table of Average Task and Paper-Pencil Scores
4.1   Socioeconomic Comparison of Bloomington Junior High Schools
4.2   Comparison of Characteristics of Initial Sample with
      Total Population
4.3   Pilot Sample Characteristics
4.4   Task Specifications
4.5   Pupil Task Averages by Level
5.1   Test Versions and Pupil Samples
5.2   Specifications of Paper-Pencil Items Desired
5.3   Content and Stage of Version I Paper-Pencil Items
5.4   Version II Test Item Content and Stage
5.5   Characteristics of Selected Version Items for Version II
5.6   Performance of "Masters" and "Transitional" Pupils
      on Versions II A and II C
5.7   Version II B Results
5.8   Level I Item Results for Grade 3 Pupils on Version II A
5.9   Version II Item Decisions
5.10  Version III A Item Decisions
5.11  Proportional Reasoning Levels of Grade 8 Pupils on
      Version IV A
5.12  Version IV A Item Decisions
5.13  Item Discrimination Version IV A
5.14  Version IV B Item Responses of Physics Pupils
5.15  Proportional Reasoning Levels of Grade 8 Pupils on
      Version V A
5.16  Version V A Item Responses of Grade 8 Oak Grove Pupils
5.17  Proportional Reasoning Levels of Grade 8 Pupils on
      Version V B
5.18  Version V B Responses of Grade 8 Oak Grove Pupils
5.19  Version V B Item Discrimination
5.20  Percentage Correct on Test Versions by Grade 8 Pupils
6.1   Pearson Correlation Coefficients for Tasks and
      Paper-Pencil Ratings
6.2   Comparison of Observed and Expected Item Difficulties
6.3   Item Difficulties in Terms of Performance for 427
      Grade 8 Pupils
6.4   Percentage of Correct Pupil Responses in Relation to
      Pupil Tested Reasoning Level
6.5   Item Discrimination
6.6   Cross Tabulation of Pupil Response and Pupil Level
      for Item 19
6.7   Cross Tabulation Significance for Level IV Items






LIST OF FIGURES

FIGURE

I     Level I Item Design and Example
II    Level II Item Design and Example
III   Level III Item Design and Example
IV    Level IV Item Design and Example
V     Performance Index
VI    Grade 8 Pupil Performance on Test Version III A
VII   Pupil Performance on Test Version IV A
VIII  Pupil Performance on Test Version IV B
IX    Pupil Performance on Test Version V A
X     Pupil Performance on Test Version V B
XI    Level I Item Design and Example: Test Item 1
XII   Level II Item Design and Example: Test Item 12
XIII  Level III Item Design and Example: Test Item 24
XIV   Level IV Item Design and Example: Test Item 16
XV    Average Per Cent Success of 427 Eighth Grade Pupils
      at the Four Test Levels






APPENDIX

A  Pilot Study Results and Calculations

B  Task Interview Protocols

C  Calculations of Final Test Characteristics

D  Final Paper-Pencil Test

E  Pupil Results and Test Improvements in Versions II-VI






CHAPTER 1 
THE PROBLEM 



Introduction 

The purpose of this study was to develop a paper-pencil 
test of Piagetian levels of proportional thinking of junior high 
school pupils in the context of physical science. This seemed 
to be a desirable goal for several reasons: 

1. The junior high pupil's proportional reasoning ability
is of special interest. The age of thirteen, as Inhelder and
Piaget (1958) showed, is the common age for transition to formal
thought levels in proportional reasoning.

2. Present science curricula in the junior high school
include such content as density, quantitative relationships of
chemical reactions, genetic ratios and the dynamic relationships
between force, mass and acceleration. The establishment of the
level of proportional reasoning ability of a class of pupils would
provide a basis for the selection of appropriate curriculum content.

3. Instructional materials and instructional strategies
used by junior high science teachers are intended to develop, among
other outcomes, cognitive reasoning. Pre- and post-measures of
proportional reasoning levels would direct the choice and design
of appropriate materials and strategies of instruction.


4. Existing paper-pencil tests do not measure the level of
proportional reasoning attained by the subjects. Mathematics tests
whose subtests purport to measure competency in using ratio and
proportion do so through seeking one correct answer. The other
answers available for selection do not have a logical basis and
make no contribution to determining the subject's level of
proportional reasoning in the Piagetian sense.

5. Task interviews provide an intensive measure of a
limited population and are important as research tools. A
typical interview requires about 20 to 40 minutes and establishes
a proportional reasoning level for one person in one type of
content. They are not, therefore, practically applicable for use
with the large numbers of pupils with whom teachers meet.

6. Experience and techniques used in designing a paper-pencil
test from task interviews in proportional reasoning should
be applicable to other such test design. Rigorous application of
the principles of criterion-referenced test design has not been
frequently accomplished.

Statement of the Problem 

Hypothesis and Task of Study 

It was hypothesized in this study that proportional
reasoning in physical science may be measured by appropriate
criterion-referenced paper-pencil testing and that these
criterion-referenced paper-pencil tests would provide the same
kind and amount of information that could be obtained through the
use of other modes of examination.

The task of this study was to develop a set of paper-pencil
items to assess the Piagetian proportional reasoning level of
pupils. The test to be developed should have these characteristics:
1) Require a 30-minute testing session. 2) Allow for the
measurement of large numbers of persons. 3) Use items with
different science content. 4) Have the reliability offered by
several measures of the same person. 5) Require no expertise of
the test administrator. 6) Be usable as a source of information
for determining the numbers of pupils at the various proportional
reasoning levels and which pupils are at each of these levels.

Definitions

Proportions, for the purpose of this study, are "two ratios
that are equivalent" (Copeland, 1974, p. 160).

Proportional reasoning levels, for the purpose of this
study, were the levels used by Inhelder and Piaget (1958). They are
listed here in ascending order of complexity, with a description
of the kind of proportional reasoning pupils might use.

Preoperational
     Subject guesses or makes no ordered connection between
     things which change.

Concrete Operational I
     Subject compensates in some qualitative way and may match
     direct ordered relations.
          A < B < C < D
          J < K < L < M

Concrete Operational II
     Subject uses a rule, usually addition, to calculate increase
     or decrease and may order corresponding relations with inverse.
          A < B < C < D
          J > K > L > M

Formal Operational I
     Subject calculates by multiplying or using simple ratios,
     contrasts ratios and can order them: 5/25 > 2/25.

Formal Operational II
     Subject uses proportions and recognizes the appropriate
     proportion to be used: A/B = C/D or A/B = C/D = E/F. Subject
     will seek and refer to a general rule linking the relationship.
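
The relations named in the level descriptions above can be illustrated with a short sketch. It is not part of the study; the function names and the sample numbers are assumptions made only for illustration, and exact fractions are used so that the equality of two ratios is tested without rounding.

```python
from fractions import Fraction


def is_increasing(values):
    """True when each value is smaller than the one that follows it."""
    return all(a < b for a, b in zip(values, values[1:]))


def is_decreasing(values):
    """True when each value is larger than the one that follows it."""
    return all(a > b for a, b in zip(values, values[1:]))


def direct_order(first, second):
    """Direct ordered relation: A < B < C < D matched with J < K < L < M."""
    return is_increasing(first) and is_increasing(second)


def inverse_order(first, second):
    """Inverse ordered relation: A < B < C < D matched with J > K > L > M."""
    return is_increasing(first) and is_decreasing(second)


def is_proportion(a, b, c, d):
    """Two ratios that are equivalent: A/B = C/D (integers assumed)."""
    return Fraction(a, b) == Fraction(c, d)


# Hypothetical values, chosen only to exercise each relation.
print(direct_order([1, 2, 3, 4], [10, 20, 30, 40]))
print(inverse_order([1, 2, 3, 4], [40, 30, 20, 10]))
print(Fraction(5, 25) > Fraction(2, 25))
print(is_proportion(2, 4, 3, 6))
```

The ordering checks correspond to the two Concrete Operational descriptions, the ratio comparison to Formal Operational I, and the proportion check to Formal Operational II.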

Criterion-referenced testing, for the purpose of this
study, is a testing referenced to the criteria of the discrete
levels of proportional thinking. Item design and item selection
techniques are those of good criterion testing technique.

Performance criteria, for the purpose of this study, is
the level of performance which distinguished the behavior
characteristics of a person achieving the level, a master, from
those of a person not achieving the level, a non-master. Potential
masters and potential non-masters were identified by reason of
maturity or measurement. Grade 11 science pupils were supposed,
generally, to be masters of formal proportional reasoning while
grade 5 pupils were supposed, generally, to be non-masters. Piaget
and others in the field suggest that most pupils would achieve
formal proportional reasoning only after reaching age thirteen. The
performance criteria of each proportional reasoning level for task
interview performance were derived from Piaget's descriptions.
Performance criteria for paper-pencil performance were set at
success on two-thirds of the items for that level as discussed in
Chapter 5.
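
The two-thirds criterion can be sketched as a scoring rule. The sketch below assumes six items per level, as in the final form of the test, and assumes that a pupil is assigned the highest level mastered without a gap below it. That assignment rule is an illustrative assumption, not a procedure taken from this study, and the function names are hypothetical.

```python
LEVELS = [
    "Concrete Operational I",
    "Concrete Operational II",
    "Formal Operational I",
    "Formal Operational II",
]


def masters_level(correct, n_items=6):
    """True when at least two-thirds of the items for a level are correct."""
    # Integer arithmetic avoids rounding: correct / n_items >= 2 / 3.
    return 3 * correct >= 2 * n_items


def assign_level(correct_by_level):
    """Return the highest level mastered with every lower level also mastered.

    correct_by_level lists the number of correct items per level, lowest
    level first. None is returned when the first level is not mastered.
    This cumulative rule is an assumption made for illustration.
    """
    assigned = None
    for name, correct in zip(LEVELS, correct_by_level):
        if not masters_level(correct):
            break
        assigned = name
    return assigned


# Hypothetical pupil: 6, 5, 4 and 1 items correct on the four subtests.
print(assign_level([6, 5, 4, 1]))
```

With six items per level the criterion amounts to four or more correct responses on a subtest.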

Basic Design 

This study was conducted in three steps or phases: an
initial trial or pilot phase, an intensive task testing phase with
40 pupils to produce an initial item design, and an extensive
paper-pencil testing phase with groups that in some cases exceeded
300 pupils from which the final item set was written.

Phase I - Pilot Study

In the pilot study the writer sought to assess whether it
might be possible to identify proportional reasoning levels in the
pupils and to measure them with paper-pencil items.

Individual interview tasks were administered to a group of 
pupils and different proportional reasoning levels were discerned 
among the pupils. Paper and pencil items derived from the tasks 
were later administered to the same pupils. It was found to be 
possible with tasks to identify the different levels of proportional 
reasoning to which the pupils had developed. These proportional 
reasoning levels were found to be measurable with paper-pencil 
items. 

Phase II - Task Interview Testing

In this phase the writer sought to measure proportional
reasoning levels of a sample of pupils by interview tasks and to
use this measure to validate and select an initial set of
paper-pencil items.

Forty pupils were selected by stratifying all the grade
eight pupils of a school according to their Lorge-Thorndike total
score and choosing pupils randomly within IQ score levels to
ensure a range of proportional reasoning ability. Extensive
individual task testing on this sample was carried out with
rigorously defined tasks. Paper-pencil items were carefully derived
from the original tasks, written to four levels of proportional
thinking, and administered to the pupils. From the results of
this paper-pencil testing an initial set of items was chosen for
use in Phase III.
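
The sampling step described above, stratifying by score and then drawing randomly within each score level, can be sketched as follows. The band width, the number drawn per band, the pupil identifiers and the scores are all hypothetical; the study does not state them in this section.

```python
import random
from collections import defaultdict


def stratified_sample(scores, band_width, per_band, seed=0):
    """Draw pupils at random within score bands.

    scores maps a pupil identifier to a total score. Pupils are grouped
    into bands of band_width points and up to per_band pupils are drawn
    from each band. All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    bands = defaultdict(list)
    for pupil, score in sorted(scores.items()):
        bands[score // band_width].append(pupil)

    sample = []
    for band in sorted(bands):
        members = bands[band]
        sample.extend(rng.sample(members, min(per_band, len(members))))
    return sample


# Hypothetical school of 80 pupils with scores from 70 to 149.
scores = {f"pupil_{i:02d}": 70 + i for i in range(80)}
chosen = stratified_sample(scores, band_width=10, per_band=5)
print(len(chosen))
```

Drawing the same number from every band spreads the sample across the score range, which is the stated purpose of the stratification.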

Phase III - Paper-Pencil Testing

In the final phase the writer sought to produce a
paper-pencil test with an administration time of approximately 30
minutes that would measure proportional reasoning levels of
eighth grade pupils.

The initial item set was used with large populations of 
grade eight pupils. The item responses were analyzed for their
ability to discriminate between proportional reasoning levels. 
Items were revised or replaced and the test was administered again. 
Populations of masters, senior high science pupils, and of non- 
masters, grade five students, were also used. Ten versions of the 
test were used. The validity and reliability of the final version 
were measured. 




CHAPTER 2 

SURVEY OF RELATED RESEARCH LITERATURE 

Because this study was concerned with the development of an
instrument for large scale measure of proportional reasoning ability
of junior high pupils, three types of literature were pertinent to
the study: 1) studies of the formal stages of intellectual growth
of pupils, 2) studies of proportional thinking, and 3) studies of
measurement with criterion-referenced testing.

There is general discourse concerning Piaget's research
and there are scholarly statements of explanation like those of
Darley and Anderson (1951), Jensen (1973), Wood (1974), Beistel
(1975), Herron (1975), and Mallon (1976) where postulates,
guidelines and suggested instructional strategies are proposed for
general science teaching and where the problems of proportional
reasoning are discussed. Such discourse and statements are not
reviewed in this chapter because of their lack of research
information. Expert statements and procedural recommendations in the
literature on criterion testing are reviewed because of their
interest to criterion test design.

Proportional thinking was classified by Inhelder and
Piaget (1958) as a formal operational level ability. The studies
of formal operational stages are thus of concern. A proportion
is defined by Mandell (1974) as "a statement of equality of two
ratios." Studies of pupil operations with ratios as well as with
proportions are reviewed. A criterion-referenced test as viewed
by Glaser and Nitko (1971) is a test that is deliberately
constructed to yield measurements that are interpreted in terms of
performance standards. Criterion-referenced testing is concerned
with the measurement of individual and group performance in
relationship to established criteria. Professional statements and
studies here dealing with the design of criterion-referenced tests
are important to the study.

Studies of Formal Operations 

Original Studies

The description of formal operational thought originated
with Piaget (1926). Specific attention to proportional reasoning
appeared later.

In The Growth of Logical Thinking, Inhelder and Piaget
(1958) described the study of intellectual stages of growth of
persons from five to fifteen years in age. The subjects were
individually given task interviews. Fifteen such separate
investigations were conducted. Discernible levels of concrete and
of formal thought were reported for each investigation. Piaget
(1972) noted that individuals performing different tasks do exhibit
different levels of thought. He suggested that the formal
operation tasks should be such that for subjects the situations
should involve equal aptitudes or comparable interests.






Piaget and Inhelder (1969) identified the emergence of
proportional reasoning with the ages of eleven or twelve. Piaget
(1972) described the formal stage as being related to verbal
capacities and characterized the formal stage as a stage where the
capacity to reason in terms of verbally stated hypotheses appeared.
Piaget (1972) described the stages as resulting in a certain number
of overall structures which became necessary with development. An
important problem he noted was the time lag between solution of
problems in different areas. He reported that at certain ages
changing the material or situation used in testing gave different
test results. Piaget (1964) identified maturation, experience,
social transmission and equilibration as factors which explain the
person's development from one set of structures to another. Such
development he saw as interaction with things. Knowing an object
meant acting on it, modifying it and transforming it. It also
involves interaction with thought. This thought interaction is
the essence of equilibration. Smedslund (1964) explained that the
difference between learning and equilibration is the difference
between the interaction of thought with things and the interaction
of thought with itself.

In summary, Piaget and his colleagues identified a formal
stage of proportional reasoning ability emerging in early
adolescence. This stage should be discernible in the child's ability
to deal with spatial proportions, inertial speeds, probabilities and
related concepts in a verbal manner. Performance of the early
adolescent in proportional reasoning should depend upon the content
of the problem and the child's experience.

Replications of Original Studies 

Lovell (1961) repeated ten of the experiments described by
Inhelder and Piaget (1958) with 200 British pupils between the ages
of eight and eighteen. Lovell found that his results confirmed the
main stages in the development of logical thinking proposed by
Inhelder and Piaget. Lovell suggested that few junior high pupils
reach the level of formal thought. He reported that the least able
students remain at a low level of thought. Some fifteen-year-olds
were found not to be at the first level of formal thought.

Elkind (1961, 1962) used junior high, senior high and
college pupils respectively in a series of replication task
interviews in the conservation of volume, mass and density. Elkind
confirmed Piaget's finding of a regular age-related order in the
conservation of mass, weight and volume, but did not agree on
acquisition of an abstract concept of volume by eleven- or
twelve-year-olds. He found only about 60 per cent of college freshmen
tested believed that the volume of a ball of clay remained constant
when the clay was rolled out into a sausage form.

Jackson (1965) studied logical thinking in normal and
subnormal children. He used six of the experiments of Inhelder
and Piaget with 48 British children with an IQ range 90 to 100, and
40 British children with an IQ range 60 to 80. Jackson reported
that the subnormal children showed only limited increase in
intellectual development beyond age nine, while the normal ones
displayed levels of thinking which generally confirmed the age
level statements of Piaget.

DeVries (1973b) used Piagetian tasks to compare the
performance of children classed as bright, average and retarded. She
asked two questions: with children of the same chronological age,
do higher IQ children perform better, and with children of the same
mental age, do higher IQ pupils perform better? She reasoned that
if the answer to both questions is yes, then Piaget tasks measure
some type of intelligence. In the results, higher IQ pupils
outperformed others of the same chronological age but older children
(lower IQ) outperformed others of the same mental age.

Dale (1970) replicated Inhelder and Piaget's first
chemistry experiments using 200 Australian children from six to
sixteen years old. His findings did support the basic structure
of Piaget's theory of development of logical thinking with age and
more specifically, the development of combinatorial thinking with
age.

Towler and Wheatley (1971) replicated Piaget and Elkind
conservation tasks with college pupils. In the 71 female subjects
studied at Purdue University, Towler and Wheatley found nearly
identical, 61 per cent versus 58 per cent, acceptable responses.

Holloway (1967) reported that the child's conception of
geometry was related to his/her intellectual development level. He
noted that at the formal operational stage the logic principle
A = B, B = C therefore A = C appears.

Koasy (1974) studied formal operational thinking using
three age groups: sixth grade girls, college women and
fifty-year-old women. Five of the experiments described by Inhelder
and Piaget (1958) were used. Results showed the girls to be at the
lowest level, fifty-year-old women were intermediate and the
college women at the top. Consistency between age groups was
reported. Very few attained the formal operational level.

Bart (1971), Lovell and Butterworth (1966), and Lovell and
Shields (1967), using Piaget tasks, substantiated that formal
operational skills have a large general factor. All researchers
used a principal components analysis to analyze the task performance
of pupils. Bart, in his study, administered four Piagetian formal
thought tests, three formal operational reasoning tests and a test
of verbal intelligence to 90 scholastically above average pupils.
He also established that formal thought, as measured by Piaget's
tasks, has a substantial verbal intelligence component as well as
a nonverbal intelligence component.

McKinnon and Renner (1971), using adaptations of Piaget
tasks, found that 50 per cent of college freshmen tested were
functioning completely at Piaget's concrete operational level and
only 25 per cent of their sample could be considered fully formal
in their thought.




Every replication study cited supported Piaget's model of
an ordinal sequence of development. Generally, replication study
results showed the stages of development came at later ages than
those reported by Piaget and Inhelder. This observation was also
that of Howe (1974), who reviewed the literature to determine the
extent of evidence to support the concept of formal thought. She
found the bulk of the evidence seemed to support that there is a
qualitative change in cognitive structure or reasoning ability
beyond the level of concrete operations, no dependence on the use
of all the binary operations of propositional logic in the new
structure, and more than one process involved in the development of
logical thinking beyond the concrete level.

Related Studies 

Studies reported here are related to Piaget's work with
formal operational thought. However, these studies are different
in that they used different techniques for measurement, used
batteries of several tasks or investigated relationships between
task performance and other pupil characteristics. The general
studies of cognitive development which were reviewed produced
results that confirmed Piaget levels of development with different
testing techniques. Linn and Thier (1975) used a filmed testing
sequence to measure logical thinking.

Open questioning was the strategy used by Laurendeau and
Pinard (1962). In such questioning, the wording of the question
was changed when necessary using terms more familiar to the child,
but with care never to suggest more than was included in the
instructions.

Karplus and Karplus (1970) used a group presentation of an
Islands Puzzle with elementary school pupils, junior high school
pupils, senior high school pupils, science teachers and physicists.
The presentation included introduction of new topics in concrete
terms, pupil evaluation of an unsatisfactory hypothesis and creation
of discrepant events requiring reasoning by contradiction. This
strategy could be described as midway between the individual task
and the group paper-pencil tests. An oral description of the task
was given. The subjects responded in writing.

Batteries of Tasks 

The use of batteries of several tasks showed that different
tasks gave different results (Osiki, 1974; D. R. Phillips, 1974;
Karplus, Karplus and Wollman, 1974; Lawson, Nordland and DeVito,
1975). High correlations between tasks were rarely reported.
Lawson, Nordland and DeVito (1975) found intercorrelations ranging
from .02 to .55. Almy (1970) reported .32 as the highest
intercorrelation among a set of tasks. The composite score of such a
set of tasks was seen as the best predictor by Sayre and Ball (1975)
and Lawson, Nordland and DeVito (1975). In some cases one or two
of the tasks alone were found to be better predictors than the entire
battery (Lawson and Renner, 1975).
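
An intercorrelation of the kind reported above is the Pearson correlation between pupil scores on each pair of tasks. The following is a minimal sketch of that calculation; the task names and scores are invented for illustration and are not data from any study cited here.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def intercorrelations(task_scores):
    """Correlate every pair of tasks; keys are (task, task) name pairs."""
    return {
        (a, b): pearson(task_scores[a], task_scores[b])
        for a, b in combinations(sorted(task_scores), 2)
    }


# Hypothetical scores of five pupils on three tasks.
task_scores = {
    "balance": [1, 2, 3, 4, 4],
    "shadows": [2, 2, 3, 3, 4],
    "volume": [4, 1, 3, 2, 3],
}
for pair, r in intercorrelations(task_scores).items():
    print(pair, round(r, 2))
```

A battery of k tasks yields k(k - 1)/2 such coefficients, so a reported range such as .02 to .55 summarizes the whole set of pairs.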

Wohlwill (1960) used a scalogram analysis of Green (1956)
to determine the scalability and homogeneity of a set of measured
tasks. He determined that tasks had varying difficulties.


Correlational Studies

The studies of Wohlwill (1960), Osiki (1974), D. R. Phillips
(1974), Lawson, et al. (1975) and Sayre and Ball (1975) previously
described as studies using task batteries were also investigations
of the relationships between task performance and other pupil
characteristics.

Ball and Sayre (1972) investigated the relationship between
pupil Piagetian cognitive development and achievement in science.
They contrasted the grades 419 science pupils received with their
level of cognitive development as measured by five abstract tasks,
and concluded that pupils are being penalized, by receiving lower
grades, for not being able to think at the formal operational level.

Higgins and Gaite (1971) studied adolescent mode of thinking
on Elkind (1961) conservation tasks in contrast with thinking on a
task simulating a familiar real life situation. They found that in
the 162 pupils, ages thirteen to eighteen, successful completion of
the conservation tasks and the situation task were independent. A
significant positive correlation was established between the mean
age of the group and the number who used abstract thinking. No
significant positive correlation was found between mean age and
successful completion of the Elkind task.

Raven (1972), in a study of concept development in 160
kindergarten, grade one, grade two and grade three pupils, found
that task performance was dependent upon the: 1) inference pattern
of the task, 2) goal objects of the task, and 3) percepts of the
task.


The generalization that Piagetian cognitive level is
positively related to achievement was supported by correlational
studies. Concrete and formal levels as measured by tasks correlated
with the abstract performance level in tests of dogmatism (D. G.
Phillips, 1974), achievement in science (Ball and Sayre, 1972;
Bridgham, 1969; Sayre and Ball, 1975), achievement on commonly
used achievement examinations (Lawson, Nordland and DeVito, 1975;
Osiki, 1974), and learning of formal concepts in science (Lawson,
1973).

Developmental Studies 

A developmental sequence of levels and their scalability 
was established directly by Wohlwill (1960) who used a scalogram
analysis to analyze a set of measured tasks. Studies not utilizing
Piaget tasks or adaptations of them have also supported the
developmental sequence of levels postulated by Piaget. Nisbet
(1964) reported that those adolescents in England who had attained
puberty scored higher on intellectual and academic achievement
tests than those youngsters who were still at the prepuberty stage of
development. Carpenter, et al. (1975a) reported that in the
National Assessment of Educational Progress only per cent of
nine-year-olds correctly identified that a 2x8 rectangle had the
same area as a 4x4 square. Almost as many of them chose a 3x5
rectangle as having the area of the 2x8 rectangle. It would appear 
that proportional reasoning was required here and that the reported 
success is comparable to that found by researchers investigating 
proportional reasoning. Meyers (1970) illustrated in a collection
of questions showing the nature of the math content of the SAT
test, that an item dealing with proportional measurement would be
answered correctly by 32 per cent of the population taking that
test. Reichard, Schneider and Rapaport (1944), using sorting tasks
that were not those of Piaget, found three levels of development.
At the most concrete level, up to five or six years, children
classified objects on the basis of nonessential incidental features.
A functional level, where classification was made on the basis of
use, extended to the age of eight, and the abstract level was not
much used before the age of ten.

Kohlberg and Gilligan (1971), in describing their obser-
vations of the moral development of adolescents, suggested that in
moral development one stage of formal operations is reached at age
ten to thirteen years and the more complete stage at around fifteen
to sixteen.

Studies of Proportional Thinking 

Original Studies 

A special concern of this study was the nature of 
proportional thinking as one attribute of the formal operational 
level of thought. 

Proportional thinking was described as one attribute of the 
formal operational level of cognitive development by Inhelder and 
Piaget (1958). Their task interviews to test proportional thinking 
included the simple balance, a cart on an inclined plane, the
projection of shadows and a spinning disc testing centripetal
force. They commented that they were able to repeatedly observe
that proportional reasoning was not acquired until pupils were at
the formal operational level of cognitive development.

Proportional reasoning had been investigated by Piaget 
previously in the areas of space, speed and probability in which 
it was concluded that the age for such proportional reasoning and 
for formal operational thought was twelve to fourteen years. 

Replication of Original Studies 

A collection of research studies replicated the original 
research of Piaget in proportional reasoning. These studies
affirmed the existence of stages and the scalability of proportional
reasoning tasks, described the schema of proportional reasoning,
tested new measurement approaches and explored correlations between
proportional reasoning and other pupil characteristics. The studies
generally found proportional reasoning being acquired at older ages 
than Piaget reported. 

Lunzer and Pumfrey (1966) used tasks they designed
involving such things as matching lengths of Cuisenaire rods,
pantograph, beam balance and similarity judgments of objects. They
reported that they found that proportional reasoning, unaccompanied
by physical actions, was rarely used by average subjects below the
age of fifteen and that younger children solved some of the tasks
by successive addition.

Wollman and Karplus (1974) investigated intellectual
development beyond elementary school, with 450 seventh and eighth
grade pupils in Orinda, California. They studied children's use
of ratio in solving beam balance, proportional length, proportionate
size of shadows and pulley turning rate tasks. All tasks were
designed by the authors. They concluded that to test proportional
thinking, tasks would have to be devised that would apply the
ratio concept in familiar situations.

As reported by Steffe and Parr (1968), Lunzer (1965)
studied the relationships of developmental thinking with logical
proportion (verbal analogies) and with mathematical proportion
(metric equivalent ratio pairs). Lunzer's measurements of the
difficulties of these two types of tasks for subjects from nine to
seventeen years confirmed that numerical proportions and verbal
analogies did require formal level thinking.

Steffe and Parr (1968) studied the development of the concepts
of ratio and fraction in fourth, fifth, and sixth grades of
elementary school. Measures were used to designate a high,
middle and low group of pupils at each grade. An ability-
stratified sample of pupils was chosen. Six paper-pencil tests
were used, four on a pictorial level and two on a symbolic level.
They reported that there was little correlation between the ability
of children to perform successfully in proportionality situations
at a symbolic level such as 6/15 =   and their ability to perform
successfully on proportionality situations based on ratio or
fractional pictorial data. Also, whenever the pictorial data
which displayed the proportionalities were not conducive to
solution by visual inspection, the proportionalities were difficult
for fourth, fifth, and sixth grade children to solve.

Shepler (1969) studied teachability of probability under-
standings. The subjects were pupils chosen from a population of
67 sixth grade pupils. All were volunteers and were of above average
ability. In a pretest, task, post-test approach they did acquire
probability concepts.

Hensley (1974) studied proportional thinking in children
from grades six through twelve. Fifteen female and fifteen male
pupils from each of the sixth, eighth, tenth and twelfth grades
were tested with four tasks: beads, inclined plane, switches, and
projection of shadows. Hensley's results generally support the
findings of Piaget. He reported a scalability of levels of pro-
portional thinking and a positive relationship between grade level
and task scores. No relationship was found, however, between sex
and task scores. No correlations between tasks were calculated. No
validity or reliability measures of tasks were reported.

Kavanaugh (1974) generally confirmed the theories of Piaget
in the development of the concept of speed in children. He used
five Piaget type tasks and determined the hierarchy among subcon-
cepts of the concept of speed. Thirty-six pupils, each from grades
six, seven and eight, participated. The average age of formal
operational thought of the sample was thirteen years and four
months. A relationship between IQ and performance on the tasks
was established.

Carpenter, et al. (1975b) identified two areas of pupil
difficulties in the National Assessment of Mathematics which may
relate to proportional reasoning. They reported that the concept of
fraction was shown to be difficult to understand and use. A
consumer problem that would be solvable with proportional reasoning
was correctly answered by fewer than 40 per cent of the seventeen-
year-olds or young adults.

Raven (1974) reported research studies he and his pupils
had performed over the past seven years concerned with facilitating 
logical operations in elementary school and junior high school 
children. He saw the period of formal operations occurring between 
the eleventh and fourteenth years and proportional thinking, 
probability thinking, and correlational operations appearing during 
this stage. 

Holloway (1967) reported that pupils at the formal
operations level were able to double an area and that a transitional
age for this was about twelve years.

Novak (1974), in a review of science education research of
1972, summarized cognitive development research as supporting
Piaget's theory. He further saw the general need for established 
validity in tests that were being used and overall the need of 
setting research in appropriate learning theory. 



Factor Components of Proportional Reasoning

Probing into the nature of proportional reasoning, Lovell
and Butterworth (1966) made a principal component factor analysis
of a set of twenty tasks as performed by 60 pupils of average to
above average ability, from nine to fifteen years old. They found
that the schema of proportions depends on some central intellective
ability which is behind performance on all tasks involving pro-
portion, yet specific abilities contribute to the ability to use
proportionality in particular tasks. Also, tasks involving ratio
depend less on the central intellective ability than tasks involving
proportion. Further, they stated this proportional reasoning
ability was found to appear at fourteen years of age in some pupils,
while at even fifteen years of age some 50 per cent of the sample
might not use proportional reasoning.

This distinction between ratio and proportion was further
corroborated by the results of the Minnesota State Assessment of
Mathematics. In the Minnesota Assessment of Statewide Performance
in Mathematics, no objective specifically dealt with proportional
reasoning, yet, as reported by Adams et al. (1975), for two items
testing proportion, IIH3 and IIJ1, the state per cent correct was
respectively 16.1 and 21.2, while an item involving ratio, VB-14,
was answered correctly by 61.2 per cent.

Learning Theory Implications of Some Studies

Lovell (1970) described two types of proportion, metric
proportions involving the recognition of the equivalence of two
ratios and the schema of proportions such as thermal capacity.
This schema of proportions involves second order operations, which
are operations on operations. Margenau (1950) saw something like
these levels of complexity of Lovell's. Margenau postulated that
concepts of physical reality should be classified by the method
through which they are attained and the distance they are removed
from reality.

Rosskopf, et al. (1970), as a result of observations, stated
that the Piagetian proportionality schema is a general structure of
actions or operations that can be applied to analogous situations.
This suggests a general knowing with some different performances
depending upon content but not proficiency in one and zero in
another.

Renner and Lawson (1973), in reflecting on their research,
suggested that mental structures represent a more or less highly
organized mental system to guide behavior. Structures, in their
understanding, actually represent our knowledge.

Studies Using Group and Paper-Pencil Tests 

A collection of research by Robert Karplus and his
colleagues has been based on group tests of proportional reasoning.
Included in this collection is a survey (Karplus and Peterson, 1970),
a longitudinal study (Karplus and Karplus, 1972), an investigation
of cognitive style (Karplus, Karplus and Wollman, 1974), and a
study of the use of ratio in differing tasks (Wollman and Karplus,
1974).

In each case, subjects in classroom groups were given pages
with information and questions by one of the authors or a trained
assistant. The experimenter explained each problem and carried out
some demonstrations and measurements. The questions asked for some
answer and a reason for the answer. Subjects' answers were
categorized according to these previously designed categories
(Karplus and Peterson, 1970, pp. 814-815).

The survey involved 116 fourth and fifth grade suburban
pupils, 82 suburban sixth grade pupils, 95 urban sixth grade pupils,
75 eighth to tenth grade suburban pupils, 123 eighth to tenth grade
urban pupils and 153 eleventh and twelfth grade suburban pupils.
The survey results (Karplus and Peterson, 1970) showed that the
older urban and suburban groups were better able to solve the ratio
problem than their younger colleagues.

Interpreted in terms of Piaget levels, measured performance
for 75 eighth to tenth grade pupils was Preoperational, 15 per cent;
Concrete Operational, per cent; Formal Operational, 36 per cent.
These group results substantially compare with those reported for
task measures.

In the longitudinal study, Karplus and Karplus (1972) studied
the growth of proportional reasoning of a group of 155 sixth,
eighth and eleventh grade suburban pupils over two years of time.
About one-third of the pupils showed no change in level. The
changes that did occur confirmed the hierarchy of proportional
reasoning ability as measured by the group test.

The seventh grade in the school had three instructional
groups: "slow," "average" and "fast." The three groups performed
very differently when measured in eighth grade. The pupils of the
"slow" group made virtually no progress. In the "fast" group only
three pupils failed to reach the Piaget Formal Reasoning Level.
The pupils in the "average" group made some progress, but nothing
as dramatic as that of the "fast" group.

Karplus, Karplus and Wollman (1974) studied cognitive style
in the personal preference of persons for procedures for solving
ratio and proportion problems.

Two forms of ratio tasks were administered to 616 pupils in
grades four through nine. Results suggested that persons who do not
use proportional reasoning will use strategies that are suggested
by the task's presentation. Specifically, when a task involved
comparison of two viewed objects, the subject without proportional
reasoning often qualitatively compared the two in a manner involving
scaling. When a task involved one object and numerical data for
comparison, the subject without proportional reasoning often used
some additive approach toward solution.

The ratio value itself might have had an effect. The
ratio 3/2, which lies between one and two, tended to increase
the percentage of additive responses. A ratio of 2/1 prompted
proportional instead of additive reasoning; a ratio of 5/2 caused
some pupils to use approximate ratios of two or three, or become
confused.

Whether the task itself affects the level of proportional
reasoning was the subject of Wollman and Karplus' (1974) latest
study. They investigated the responses of 450 seventh and eighth
grade pupils to six problems that required proportional reasoning
and represented differing degrees of concreteness. The study
suggested that proportional reasoning level was dependent on the
content of the task and the type of ratio or proportion involved.

In the study paper-pencil items were used. A contrast of
paper-pencil and group interview results demonstrated that group
and paper-pencil tests gave substantially the same results.

Grant and Renner (1975) explored the use of written state-
ments of explanation for multiple choice item responses as a means
of identifying different levels of reasoning ability. Pupils from
three different biology sections at one large Oklahoma City area
high school were asked to respond to a twenty-minute multiple
choice test and give a written explanation for selecting each
answer. The same pupils were administered the separation of
variables Piaget task. Results from the study were analyzed through
chi-square technique and levels of significance were reviewed.
Good agreement between task and written measures was established.

Studies and Precepts of Criterion-Referenced Testing

Measurement with criterion-referenced testing is a com-
paratively new approach in research. A concern of this study is
to demonstrate an exemplary approach to criterion-referenced test
design. Literature that contained precepts for good test con-
struction as well as studies of test construction, item design and
appropriate statistics, as well as examples of criterion-referenced
and other paper-pencil test design, was sought to be included in
the review.

Original Studies

Tests dealing specifically with proportional reasoning at
the level of junior high were not numerous in published test
collections. Within the [illegible] citations available in May of
1974 for mathematics tests, grade seven and above, in the test
collection of Educational Testing Service, no such test was found.
Some subtests contain proportional reasoning measurements. In the
Content Evaluation Series: Mathematics Test, Form [illegible], by
Gilbert Ulness, c1969, grades seven through nine, Houghton Mifflin,
there is a subtest on ratio. In the Iowa Tests of Basic Skills,
Levels Edition, Forms 5 and 6, by Hieronymus, c1971, grades three
through eight, Houghton Mifflin, there is a subtest, ratio and
proportion. Ratio and proportion is one of some twenty topics of
the McGraw-Hill Basic Skills System: Mathematics Test by Alton L.
Raygor, c1970, grades eleven through thirteen, CTB/McGraw-Hill; no
subscores on ratio and proportion are available.

Problems concerning ratio and proportion is one of eight
topics of emphasis in the Mathematics Inventory III: Basic Skills
of Problem Solving, c1970, grades four through twelve, American
Testing Company, but no subscores are available.

Test items in ratio and proportion, when available, ask for
a single correct answer and do not identify the subject's reason
for a response. No items or subtests relate the score obtained to
a subject's proportional reasoning level.

Test Design 

Glaser (1963) saw achievement test scores as offering 
primarily two kinds of information. One, the degree to which the
pupil has attained criterion performance. Two, the relative
ordering of individuals with respect to test performance.
Criterion-referenced tests were seen as having an absolute
standard and providing explicit information on what individuals
can do independent of the performance of others. Norm-referenced
tests were seen as having a relative standard in comparison to
others and providing no information on the degree of proficiency
of an individual. They further differ in their construction in
that items within criterion-referenced tests would have similar
difficulties while norm-referenced tests would have
items with a range of difficulties.

Hieronymus (1971) equated criterion-referenced tests with 
mastery tests and saw their contribution in the monitoring and
assessment of instructional strategies and outcomes.

Ebel (1971) saw as major limitations of criterion-referenced
testing the facts that such tests do not tell us all we need to
know about achievement, are difficult to develop on any sound basis
and are only possible for a small fraction of important educational
achievements.

Task Testing Concerns 

Chittenden (1974) saw task testing as requiring open ended,
exploratory questioning. He felt that questioning children
according to the instructions of a standard protocol would force
the observer to conclude that they were, by and large, able to
conserve. Using a flexible, exploratory method, he found it was
easy to probe to find the children were preoperational.

Flavell (1963) saw the need to allow the pupil to identify 
or select reasons or rationales rather than give totally their 
explanation. 

Item Collections and Scoring 

Fremer (1972) suggested that the judgment of achievement 
of mastery be based on achievement of a proportion of some group of 
items tied to a single objective. The sampling error associated 
with the selection of only a single exercise would pose serious 
problems of interpretation. 

Fremer's (1972) recommendation for generating cutting scores was
to use an operational approach. Ratings and scores would be
collected for a sample of students. That level of test performance
which best discriminates among pupils judged to be above or below
the minimal competency level would be sought. A cutting score on
the test could be selected that would lead to the most correct
classification in the sample.
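
The operational approach to a cutting score can be illustrated with a minimal sketch. The function name, the sample scores and the judged mastery ratings below are hypothetical and are not taken from Fremer; the sketch assumes a pupil is classified as a master when the test score is at or above the cut.

```python
def best_cutting_score(scores, judged_master):
    """Return (cut, n_correct) for the cut that classifies the most pupils correctly.

    scores        -- test scores, one per pupil (hypothetical data)
    judged_master -- independent ratings, True if judged above minimal competency
    Ties are resolved in favour of the lowest cut (an assumption of this sketch).
    """
    # Try every observed score as a cut, plus one cut above the maximum
    # (which would classify nobody as a master).
    candidates = sorted(set(scores)) + [max(scores) + 1]
    best_cut, best_correct = None, -1
    for cut in candidates:
        # A classification is correct when the test agrees with the rating.
        correct = sum((s >= cut) == m for s, m in zip(scores, judged_master))
        if correct > best_correct:
            best_cut, best_correct = cut, correct
    return best_cut, best_correct


scores = [2, 3, 3, 4, 5, 5, 6]
judged = [False, False, False, True, False, True, True]
print(best_cutting_score(scores, judged))  # (4, 6)
```

With these illustrative data a cut of 4 classifies six of the seven pupils in agreement with the ratings; any other cut does worse.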

Easley (1974) found a conflict between the drive for
protocol uniformity to produce reliability and the need for
flexibility to allow the necessary depth for probing. He felt
that the quest for reliability, which results in rigid formats, is
doomed to generate many errors in the identification of cognitive
structures because it lacks the flexibility needed for probing.

Rowell and Hoffman (1975) stated that a group measure was
needed. The individually administered tests developed by Inhelder
and Piaget (1958) were viewed as prohibitively time consuming for
use in the normal classroom situation. They saw that a group test,
easily administered, readily marked, and yet retaining as many as
possible of the attributes of the original Piagetian tasks was
needed. They tested 193 pupils with a group chemistry task and
189 pupils with a group pendulum task.

No validation was made of the group task with individual 
tasks; no reliability was measured. The product moment correlation 
coefficient between the group measures was reported as r = .56.

Studies which involved the use of more than one task
(Lunzer and Pumfrey, 1966; Hensley, 1974) reported different
performances for the different tasks. Some tasks were easier than
other tasks and correlations between tasks, when reported, were in
the range .25 to .42.

D. R. Phillips (1974) identified these common errors and
misapplications of Piaget found in the literature: 1) training
studies in which children are taught verbal responses to specific
tasks, 2) interviewing techniques in which the investigator does
not ask the child for reasons for his choices and 3) scoring
criteria for reasons, when asked, that do not incorporate
reversibility or logical necessity.

Goodyear and Renner (1975), in a preliminary study of
reasons pupils gave for multiple choice item responses, found
guessing to be the most frequent category after the belief that
they knew the right answer. Also, overall 21.8 per cent of those
having wrong answers thought they were able to justify them. The
authors, from this indication of probable partial knowledge,
suggested that a test involving pupil reasons for answers would be
useful.

Written Tasks

Karplus and Karplus (1974) discussed interview versus
written tests. They saw the pupil's school work as more closely
allied to the written task situation than to the clinical
interview.

Studies Employing Criterion-Referenced Testing 

DeAvilla and Struthers (1967) developed a group measure of
pupil level with subtests in conservation, causality, relations
and logic. A cartoon format based on thirty or so situations from
Piaget experiments was used. Test quality was described in terms
of homogeneity ratios and reliability coefficients. Tests resulting
had limited homogeneity and good reliability. The reliability
values, Cronbach's Alpha (1951), were conservation, .694; causality,
.550; relations, .001; logic, .227; total test, .717.
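
As an illustration of the reliability statistic named above, the following minimal sketch computes Cronbach's alpha in its usual textbook form, alpha = (k / (k - 1)) * (1 - sum of item variances / variance of total scores), where k is the number of items. The function name and the small data set are hypothetical; they are not the DeAvilla and Struthers data.

```python
from statistics import pvariance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a pupils-by-items table of scores.

    item_scores -- list of rows, one row per pupil, one column per item
                   (hypothetical data; at least two items are required).
    """
    k = len(item_scores[0])
    # Variance of each item (column) across pupils.
    item_vars = [pvariance([row[j] for row in item_scores]) for j in range(k)]
    # Variance of each pupil's total score.
    total_var = pvariance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Three items that always agree with one another give alpha = 1.
consistent = [[1, 1, 1], [0, 0, 0], [1, 1, 1], [0, 0, 0]]
print(round(cronbach_alpha(consistent), 3))  # 1.0
```

Population variances are used for both numerator and denominator; using sample variances throughout would give the same ratio.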

The domain referenced assessment of Hively, Patterson and
Page (1968) is a process of generating items out of a matrix or
grid expressing the contents and behaviors to assess with the
assumption that all relevant contents, behaviors and related
factors can be defined from a domain or a universe of objectives.
Basic item shells would next be constructed to generate items to
meet the prespecified criteria. Such prescribed procedures were
followed by Bart (1972) and Gray (1970) where items originated from
item shell descriptions for their stem and distractors.

DeVries (1973a) through factor analysis, probed the 
relationships among Piagetian, achievement and intellectual assess-
ments. She concluded that Piagetian measures represent some
aspects of intelligence and achievement which are not included in
standardized assessments. DeVries (1973b) further reported that
psychometric tests and Piagetian tasks seem to reflect two 
different kinds of intelligence. 

Robertson and Richardson (1975) studied the problem of 
whether the conservation of a derived quantity in physics is de- 
pendent upon the conservation of constituent fundamental 
quantities. A random sample of 25 boys and 25 girls from each of 
grades seven through ten were participants in the study. This 
sample stratified for age and sex represented 25 per cent of the 
pupils in a coeducational high school in an outer Sydney area.

Testing was done using a procedure where the materials and
operations were demonstrated clearly to the pupils. A question
which was printed on the question paper was repeated. The subjects
were required to indicate their response on the paper by circling
yes or no. Reliability of the testing was established through
test and retest of a random sample drawn from grades seven and
eight. Individual and group processes were suitable. Testing was
completed in two days. Chi-square analysis was applied to identify
significant change. The writer established that conservation of
constituent fundamental quantities was a determinant in conservation
of a derived quantity.

McLeod, Birkheimer, Fyffe and Robison (1975) accomplished 
the development of a collection of criterion validated test items 
to measure the science processes of controlling variables, inter-
preting data, formulating hypotheses and defining operationally.
The development proceeded from writing a collection of face
validated items which were administered to 56 pupils whose
competency had been individually measured.

Pearson product moment correlation coefficients between 
scores on the individual criterion measures and scores on the 
selected group test items ranged from .535 to .705 and all 
correlations were significant at the .001 level. 

An attempt was made by Gray (1970) to develop and validate a
Piagetian-based written test, with successful use of the logic of
specific Piagetian tasks defined as the criterion. Ninety-
six randomly selected nine- to sixteen-year-olds, stratified by
age, were individually presented the Piagetian tasks of pendulum,
balance, and combinations and group administered a thirty-six item
logically equivalent written test. Results indicated that a
criterion-referenced approach to constructing a Piagetian-based
written test of cognitive development is possible and that the
average age of change from concrete to formal operations is
consistent with previous research.

Analysis Techniques of Validity and Reliability 

Lawson and Renner (1975) developed content based reasoning 
level tests. Face validity was established by six prominent science 
educators with competence in science and experience in Piagetian 
theory. Examinations were content validated by the classroom
teachers in the respective subject matter areas. Reliability of
each subject matter examination was determined by using the
Spearman-Brown split half correlation technique. The reliabilities
were: biology exam, r = 0.76; chemistry exam, r = 0.71; physics exam,
r = 0.S9. However, test items had no described theoretical basis
or construct validity. 
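
The split half technique mentioned above can be sketched as follows. The two halves of a test are scored separately, the half scores are correlated, and the Spearman-Brown formula, r_full = 2 r_half / (1 + r_half), steps the correlation up to the full test length. The odd and even item split, the function names and the data are assumptions of this sketch and are not described by Lawson and Renner.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product moment correlation of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_brown(r_half):
    """Step a half-test correlation up to the reliability of the full test."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(item_scores):
    """item_scores -- one row per pupil, one column per item (hypothetical data)."""
    first_half = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    second_half = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    return spearman_brown(pearson_r(first_half, second_half))


# A half-test correlation of .50 corresponds to a full-test reliability of .67.
print(round(spearman_brown(0.5), 2))  # 0.67
```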

Glaser and Nitko (1971) suggested that criterion-referenced 
tests may not directly employ classical measures of reliability
since many of the item and test statistics employed with norm-
referenced tests are dependent on the observed variance of the
total test scores. Criterion-referenced tests are expected to 
have little variance in total test scores. 

Hambleton and Novick (1972), in reviewing the definitions
for criterion-referenced tests of Glaser and Nitko, Harris,
Steward, Bormuth, and Hively, Patterson and Page, stated that
common to criterion-referenced tests is the definition of a well
specified content domain and the development of procedures for
generating appropriate samples of test items. Criterion-referenced
tests may often be multidimensional while made up of unidimensional
subscales.

Carver (1970) suggested that the reliability of a single 
form of a criterion-referenced device could be estimated by 
administering it to two comparable groups. The percentage that
met the criterion in one group could be compared to the percentage
that met the criterion in the other group. He further suggested
that the reliability of a criterion-referenced test should be
assessed by comparing the percentage of examinees achieving the
criterion on parallel tests. 

Zeiky (1974) described a reliability index as an indication
of the consistency or stability of a test score. A reliability 
index, in his description, technically indicates what percentage of 
the score variance is true score variance. 

Livingston (1972) proposed a measure for criterion-referenced 
test reliability which includes a special case, norm-referenced 
reliability. Livingston reasoned that the basic difference between 
norm-referenced and criterion-referenced measurements is that when 
using norm-referenced measures, one wants to know how far a
pupil's score deviates from the group mean and when using
criterion-referenced measures one wants to know how far his score
deviates from a fixed standard. Therefore, each concept based on
deviations from the mean score should be replaced by a corresponding
concept based on deviations from the criterion score.
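
The replacement of deviations from the mean by deviations from the criterion can be made concrete with a minimal sketch. It uses the form in which Livingston's coefficient is commonly stated, k2 = (r * s2 + (M - C)^2) / (s2 + (M - C)^2), where r is the conventional reliability, s2 the score variance, M the mean and C the criterion score. This formula is supplied here for illustration and is not quoted in the text above; the function name and numbers are hypothetical.

```python
def livingston_k2(reliability, variance, mean, criterion):
    """Criterion-referenced reliability in the form commonly attributed to Livingston.

    reliability -- conventional (norm-referenced) reliability of the test
    variance    -- variance of the observed scores
    mean        -- mean of the observed scores
    criterion   -- fixed criterion (cutting) score
    """
    gap_sq = (mean - criterion) ** 2
    return (reliability * variance + gap_sq) / (variance + gap_sq)


# When the criterion equals the mean, the coefficient reduces to the
# conventional reliability (the norm-referenced special case).
print(round(livingston_k2(0.6, 4.0, 10.0, 10.0), 2))  # 0.6

# When the mean lies away from the criterion, the coefficient is larger.
print(round(livingston_k2(0.6, 4.0, 10.0, 12.0), 2))  # 0.8
```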

Harris (1972) objected to the Livingston coefficient
because it appeared identical to a conventional reliability
coefficient, when that coefficient was based on two populations
with means equally distant above and below the criterion score.
Livingston replied to this objection emphasizing that criterion-
referenced test score interpretations do not require that the
criterion score be seen as a mean of score distribution.

A test-retest approach to criterion-referenced test
reliability was the suggestion of Zeiky (1974). The percentage of
cases that shift classification, between successive administrations
of the same test or between parallel forms, would be the measure.
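
The test-retest measure described above reduces to a simple count, sketched below. The function name, the cut score of 4 and the scores are hypothetical, and the sketch assumes a pupil is classified as a master when the score is at or above the cut.

```python
def percent_shifted(first_scores, second_scores, cut):
    """Percentage of pupils whose master/non-master classification changes
    between two administrations (hypothetical data, same pupil order)."""
    shifts = sum(
        (a >= cut) != (b >= cut)  # True when the classification differs
        for a, b in zip(first_scores, second_scores)
    )
    return 100.0 * shifts / len(first_scores)


first = [3, 5, 6, 2]
second = [4, 5, 3, 2]
print(percent_shifted(first, second, cut=4))  # 50.0
```

In this illustration the first and third pupils change classification, so half of the cases shift. A lower percentage indicates a more consistent classification.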

Content validity of a criterion-referenced test must be
high. Popham and Husek (1969), Kriewall (1969), Carver (1970) and
Hambleton and Novick (1972) all state this in some way. Popham and
Husek saw this as the primary measure of validity. 

Zeiky (1974) discussed the methods of setting cutting scores.
Among these he included the method of empirically using preselected
groups which within a school system, particularly at the elementary
years, could be the grade levels. Masters could be those pupils
who have taken a course or by age have had the experience. Non-
masters would be from some lower grade. The criterion-referenced
test would be administered to both groups and the distribution of
scores obtained. A cutting score then would be selected that best
discriminated between the two groups. This idea of cutting scores
and empirical examination of levels gives direction to the
examination and design of a developmental level test.

Zeiky (1974) applied the ideas of classical test theory to
criterion-referenced tests. He felt it should be possible to apply
traditional methods if score variance is "built-in" by selecting
two pretest samples known by independent means to be split evenly
above and below mastery level and pooling them into one group. 

Woodson (1974) had similar views and stated that for
criterion-referenced tests, item analysis and test development must
be done on observations representative of the observations within
the range of interest on the characteristic of interest, that is,
above and below the criterion level.

Zeiky (1974), Kriewall (1969) and Ivens (1970) saw that
item difficulty measures can be used to improve a set of intended
homogeneous items. Ivens suggested that any one of a set of homo-
geneous items that has a difficulty widely discrepant from others
in the set should be treated with caution.

Zeiky summarized the recommendations of Popham and Husek (1969)
and Nitko and Hsu (1974) concerning the use of item discrimination
indices: one should consider score variance as well as
the index. If normal discrimination indices are low because score
variance is low, there is no problem. If score variance exists in
reasonable amounts and item discrimination is still low, there is
likely to be a problem. If discrimination indices are negative,
there is definitely a problem which should be corrected. An index
of item quality was suggested by Besel (1973) based on estimates of
the probability that a "non-master" will answer an item correctly
and the probability that a "master" will have an item wrong. The index
identifies with high indices those items with the most information
for dividing pupils into masters and non-masters. Estimates of the
index can be obtained by administering the item to groups known by
independent means to consist of non-masters and masters
respectively.
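
The two probabilities named above can be estimated from known groups as in the following sketch. The function names and the response data are hypothetical, and the combined value shown is a simple difference of proportions used here only as a stand-in; it is not claimed to be Besel's formula.

```python
def item_error_rates(master_responses, nonmaster_responses):
    """Estimate P(non-master answers correctly) and P(master answers wrongly).

    Each argument is a list of booleans, True meaning a correct response.
    """
    p_nonmaster_correct = sum(nonmaster_responses) / len(nonmaster_responses)
    p_master_wrong = 1 - sum(master_responses) / len(master_responses)
    return p_nonmaster_correct, p_master_wrong


def simple_discrimination(master_responses, nonmaster_responses):
    """Assumed stand-in index: P(master correct) - P(non-master correct)."""
    p_nm_correct, p_m_wrong = item_error_rates(master_responses,
                                               nonmaster_responses)
    return (1 - p_m_wrong) - p_nm_correct


# Hypothetical responses to one item: 8 of 10 masters and
# 2 of 10 non-masters answer correctly.
masters = [True] * 8 + [False] * 2
nonmasters = [True] * 2 + [False] * 8
print(item_error_rates(masters, nonmasters))
print(simple_discrimination(masters, nonmasters))
```

A value near 1 marks an item that separates the two groups well; a negative value corresponds to the case described above as definitely a problem.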






CHAPTER 3 



PHASE I - THE PILOT STUDY 

Phase I of this study was a probe into the nature of
proportional reasoning levels and a trial of the possibility of
measuring proportional reasoning levels with a paper-pencil test.

Setting 

School Site

The pilot study was conducted in Penn Junior High School
in Bloomington, Minnesota. The city of Bloomington had three
junior high schools. Penn Junior High School pupils ranked the
highest of all junior high schools in the mean composite score on
the Iowa Tests of Basic Skills. With regard to socioeconomic
status, Penn Junior High School ranked second among the three
junior high schools.

Penn Junior High School was chosen because of the interest
and cooperation of their science teaching staff. The writer had
worked with this staff to review their goals for science teaching.
The study had its origin in questions this group had about the
problems their eighth grade pupils were having while using
proportions in physical science.




Pupils 

Classes of two of the four grade eight physical science
teachers were used by the writer in conducting Piagetian task
interviews with pupils. The teachers of these classes pointed out
pupils with low and with high class performances so that the writer
might select pupils with some range of ability. The pupils in the
sample had completed some three months of the half-year course at
the time of task interviewing and had completed all of the course
at the time of paper-pencil testing.

Basic Design 

Initial Study 

The writer had tested four grade eight mathematics classes
with the Mr. Tall and Mr. Short ratio problem (Karplus and Karplus,
1970). Pupil answers followed the pattern found by Karplus.

Discussions with Robert Karplus, with Clarence Boeck and
with John Stecklein encouraged the writer to develop a
paper-pencil instrument.

The writer sought in a pilot study to gain some indication 
of probable tasks to use, task testing experience, and appropriate 
content for proportional reasoning testing. 

Task Interviews 

Piagetian task interviews were conducted using a total of 
25 tasks with a total of 25 pupils. Each group of five pupils 
performed a set of five tasks. That is to say: pupils A-E
performed tasks 1-5 and pupils F-J performed the next five tasks
and so on through the full 25. No pupil performed more than five
tasks but each task was performed by five pupils. This is tabled
in the Phase I results later in the chapter.

Each task involved physical objects and materials. The
pupils observed and handled these objects and materials. The tasks
involved physical and geometric proportions. Direct, inverse,
direct-as-square and inverse-as-square relations were all included
in the interview tasks. Each interview followed a defined question
format that was structured after the Chittenden (1974) approach of
probing questions culminating in a direct question asking for the
student's reasoning.



Task:

The rods are measured for the pupil.

The longer one is set up and its shadow measured.

Materials:

Cuisenaire rods, 8 cm orange and 4 cm yellow

Ruled grid

Lamp - Hi intensity







Questioning: 

Introduction: The orange rod you can see is about 16 units
long. The yellow one is about 8. When I set up the orange
rod and the lamp, the rod has a shadow 10 units long.

Prediction: The number of units of shadow I would get if
I set up the yellow rod in the same way without moving the
lamp.

Appendix B includes similar descriptions of the final version of 
many of these tasks. 

Five task interviews were conducted with each pupil. The 
interview and each pupil's response were recorded on audio tape as 
well as being recorded in notes. Responses were scored into 
categories according to the criterion behavior exhibited and given 
a numerical value. This scoring is described in Table 3.1.

Table 3.1
Task Interview Criteria

Stage            Criterion Behavior and Example                  Score

Preoperational   Subject guesses or makes no connection            0
                 between how things change and some rule.
                 Pupil example: "I guessed."

Concrete I       Subject compensates in some qualitative way.      1
Operational      Pupil example: "Because it's bigger."

Concrete II      A rule, usually addition, is used to              2
Operational      calculate the increase or decrease.
                 Pupil example:
                 "I added 10 + 6 = 16 so 2 + 6 = 8."

Formal I         The subject calculates by multiplying or          3
Operational      using simple ratios.
                 Pupil example:
                 "10/16 x 8 = 5. I multiplied."

Formal II        The subject uses proportions.                     4
Operational      Pupil example:
                 "5/8 = 10/16. It's proportional."






Sample pupil responses and their scoring are shown in
Table 3.2. Student answers were recorded in notes and in audio
tape recording. The grading of responses was done from notes and
replaying the tapes.



Table 3.2
Sample Pupil Responses

Answer     Reason                                          Score

5          I guessed                                         0
About 4    It has to go down                                 1
2          It goes down 6                                    2
5          I multiplied 10/16 x 8                            3
5          Because it goes the same way 10/16 is 5/8         4
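
The arithmetic behind the additive, multiplicative and proportional responses to the rod and shadow problem can be checked with a short sketch. The function names are introduced here for illustration only; the numbers are those of the task (a 16 unit rod with a 10 unit shadow, and an 8 unit rod).

```python
from fractions import Fraction


def additive_answer(long_rod, long_shadow, short_rod):
    """Concrete II strategy: subtract the same difference (16 - 10 = 6)."""
    return short_rod - (long_rod - long_shadow)


def multiplicative_answer(long_rod, long_shadow, short_rod):
    """Formal I strategy: multiply by the ratio shadow / rod."""
    return Fraction(long_shadow, long_rod) * short_rod


def is_proportional(long_rod, long_shadow, short_rod, answer):
    """Formal II check: answer / short_rod equals long_shadow / long_rod."""
    return Fraction(answer, short_rod) == Fraction(long_shadow, long_rod)


print(additive_answer(16, 10, 8))         # the additive response
print(multiplicative_answer(16, 10, 8))   # the ratio response
print(is_proportional(16, 10, 8, 5))
```

The additive rule gives the response scored 2 in the table, while the multiplicative and proportional strategies both lead to the same numerical answer and differ only in the reasoning the pupil states.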



Paper-Pencil Tests

The twenty-five tasks were then written as paper-pencil
items and all items were given to all 25 pupils. Because the
writer questioned what form to use for the items, distractors for
the paper-pencil items were written in the four different forms
illustrated. The item forms were distributed throughout the test.






Flag Pole

Introduction (stem):  The orange rod you can see is about 16
                      units long. The yellow one is about 8.
                      When I set up the orange rod and the lamp
                      the rod has a shadow 10 units long.

Predict (question):   The number of units of shadow I would get
                      if I set up the yellow rod in the same
                      way without moving the lamp.



Form I

Pupil solves the problem for his answer which he records, and
selects a description indicating his method of solution.

Answer you found        Reason

                        a - I guessed
                        b - I added
                        c - I multiplied
                        d - I used a ratio



Form II

Pupil selects an answer and an appropriate reason.

a - 5          5/8 = 10/16
b - About 4    short is half as tall
c - 4          I subtracted a little less
d - 2          I subtracted 6



Form III

Pupil selects an answer and a reason from identical answers
but different reasons.

a - 5 because 5/8 = 10/16
b - 5 because 10/16 x 8 = 5
c - 2 because 8 - 6 = 2
d - 2 because it should be smaller

Form IV 

Pupil selects a method. Select the approach you would use. 
a - I guess 

b - I use a proportion 

c - I would add 

d - I would multiply 


Pilot Study Results 

Pupil results on tasks of this pilot study were analyzed to
confirm the probable existence of levels of proportional reasoning
and to examine the success of their measurement with designed
tasks and paper-pencil items.

Task Interviews 

Levels of proportional reasoning were evident in the
results. As shown in Table 3.3, pupils did have a range of task
scores.

Table 3.3
Pupil Average Scores on Pilot Tasks

Level    0   Trans.(a)   I   Trans.   II   Trans.   III   Trans.   IV
Pupils   1      2        4     3      4      .       2      8      1

(a) Trans. = Transitional

The pupil results were also used to analyze the discrimi- 
nation power and the consistency of the tasks. 

All pupil task scores were arranged in the pattern shown in
Table 3.4. Here it can be seen that task 1-1, Thermometer, showed
discrimination for only one pupil scored. This suggested that this
task should not be used in further testing.

The underlined scores (3, 0) are scores which differ by 2
or more from the average score that pupil received. Such a wide
difference suggested that this task may not have been measuring






Table 3.4

Rating of Pilot Task Performance

                                 Tasks

          1-1           1-2      1-3      1-4      1-5
Pupils    Thermometer   Folds    BB Cr    Recipe   Sq A    Average

A             2           3        0        3        2       2.0
B             2           4        3        3        3       3.0
C             2           4        4        4        4       3.6
D             0           0        0        3        0        .6
E             2           3        1        0        3       1.8



the same thing as other tasks. This recipe task was rewritten
before it was used again. Description of all tasks, paper-pencil
items and pupil scores may be obtained from the writer.
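
The rule of marking a score that differs by 2 or more from the pupil's average can be sketched as follows. The scores are those read from Table 3.4, where a few illegible cells have been inferred from the printed averages; the function name and the data layout are assumptions for illustration.

```python
TASKS = ["Thermometer", "Folds", "BB Cr", "Recipe", "Sq A"]

# Scores as read from Table 3.4 (some cells inferred from the averages).
SCORES = {
    "A": [2, 3, 0, 3, 2],
    "B": [2, 4, 3, 3, 3],
    "C": [2, 4, 4, 4, 4],
    "D": [0, 0, 0, 3, 0],
    "E": [2, 3, 1, 0, 3],
}


def flag_deviant_scores(scores, tasks, threshold=2.0):
    """List (pupil, task, score) where |score - pupil average| >= threshold."""
    flagged = []
    for pupil, row in scores.items():
        average = sum(row) / len(row)
        for task, score in zip(tasks, row):
            if abs(score - average) >= threshold:
                flagged.append((pupil, task, score))
    return flagged


for entry in flag_deviant_scores(SCORES, TASKS):
    print(entry)
```

Under these readings the rule marks one score of 0 and one score of 3, which agrees with the two underlined values mentioned in the text.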



Paper-Pencil Tests 

Levels of proportional reasoning were present as found in 
the paper-pencil testing. These levels are summarized in Table 3.5. 

Table 3.5
Pilot Paper-Pencil Average Scores

              Level and (Range of Average Scores)

         0           I            II           III          IV
     (0 - 0.4)   (0.5 - 1.4)  (1.5 - 2.4)  (2.5 - 3.4)  (3.5 - 4.0)

Pupils
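
The score ranges in the table heading amount to a simple classification rule, sketched below. The function name is hypothetical, and treating averages that fall between the printed ranges (for example 0.45) as belonging to the lower level is an assumption.

```python
def level_from_average(average):
    """Map an average score from 0 to 4 onto the level labels 0, I, II, III, IV."""
    if not 0 <= average <= 4:
        raise ValueError("average score must lie between 0 and 4")
    labels = ["0", "I", "II", "III"]
    upper_bounds = [0.5, 1.5, 2.5, 3.5]   # each level ends just below its bound
    for label, bound in zip(labels, upper_bounds):
        if average < bound:
            return label
    return "IV"


for value in (0.4, 0.5, 2.4, 3.4, 3.5, 4.0):
    print(value, level_from_average(value))
```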



There was no perceptible difference in pupil scores with
different distractor formats. Pupils who regularly solved problems
by guessing would candidly indicate that they guessed when asked or
would solve the problem in that way when a solution was required.

The items lacked good consistency, had a wide range of
discrimination and showed variation in difficulty. In Table 3.6
it was noted that items 2.2 and 3.3 had average scores of 3.0 while
items 2.4, 2.5, 4.2, 5.3 and 5.5 each had an average score of 1.9.

Table 3.6

Average Scores of Paper-Pencil Problems

Problem    Average Score

  1.1          2.8
  1.2          2.7
  1.3          2.8
  1.4          2.4
  1.5          2.4
  2.1          2.4
  2.2          3.0
  2.3          2.5
  2.4          1.9
  2.5          1.9
  3.1          2.2
  3.2          2.4
  3.3          3.0
  3.4          2.9
  3.5          2.2
  4.1          2.0
  4.2          1.9
  4.3          2.0
  4.4          2.6
  4.5          2.8
  5.1          2.6
  5.2          2.1
  5.3          1.9
  5.4          2.7
  5.5          1.9




That a relationship between task scores and paper-pencil
scores existed was evidenced by the contingency analysis in
Table 3.7. The hypothesis that the relationship here was due to
chance was rejected after the chi-square statistic was computed.
Chi-square here was 19.97. For nine degrees of freedom this
hypothesis may be rejected for 98 of 100 cases. This calculation
is found in Appendix A.
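
For readers unfamiliar with the statistic, the following is a minimal sketch of the usual chi-square computation for a contingency table. It is not the Appendix A calculation; the function name and the small table of counts are hypothetical, and the sketch assumes every row and column total is greater than zero.

```python
def chi_square(table):
    """Return (statistic, degrees_of_freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    degrees_of_freedom = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, degrees_of_freedom


# Hypothetical counts, for illustration only.
print(chi_square([[10, 0], [0, 10]]))
```

For a table with four rows and four columns the same function returns nine degrees of freedom, the number used in the analysis above.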

Table 3.7

Contingency Table of Average Task and Paper-Pencil Scores

Average Paper-            Average Task Score
Pencil Score         1       2       3       4      Totals

     1               1       1                         2
     2               2       4       2       1         9
     3               2       1       3       1         7
     4                               1       2         3

   Totals            5       6       6       4        21



Implications for Phase II



Paper-pencil items did appear to measure proportional
reasoning and the results were comparable to those of other
researchers (Karplus, Karplus and Wollman, 1974). This implied
that a thorough research study to develop a paper-pencil test
should be attempted.

Variations between task measures were evident. This
suggested that exacting descriptions should be made of the task
interviews and three task measures based in the literature should
be given to all pupils tested with tasks in the next phase. A
larger number of pupils should be involved in task testing in the
next phase in a way to give more pupils at each reasoning level.

The results suggested that the paper-pencil items would
need much refinement. There appeared to be no clear support for
pupil solution of the problem or selection of just an answer over
just selecting the description of the method of solution. It was
reasoned that paper-pencil items should be rigorously designed,
written in sets for each of the four levels and empirically
improved through large volume and repeated testing.

Certain questions, including the higher ordered proportions,
direct as cube, inverse as square, appeared to be at a different
level. Proportions involving circular areas gave very different
results.

It was decided that proportions should not involve circular
areas; the items with higher order proportions should be
carefully screened.






CHAPTER 4
PHASE II - TASK INTERVIEW TESTING 

This phase of the study was the task testing of a selected
group of 40 eighth grade science pupils. This phase accomplished
a Piagetian task measure of these pupils' proportional reasoning
ability. The pupil responses to task measures and the pupil
performance on task measures were the basis for construction and
selection of paper-pencil items for the test instrument desired in
the study.

Setting 

The writer, employed by the Bloomington School District,
chose to use Bloomington as the site for the study because of the
convenience of working within the district and the relevance of
this study to the Bloomington science program.

Demographic and pupil test data from elementary schools
of the junior high attendance areas were used to establish
socioeconomic and pupil ability rankings. This information was
gathered by the school district in gaining Title I Elementary
Secondary Education Act (ESEA) designation of target schools.
Data of this sort were available from the Information Office of
the Bloomington Schools. Table 4.1 shows a composite of the
rankings of elementary schools by socioeconomic status and by
pupil achievement test grades listed for each junior high
attendance area.

Table 4.1

Socioeconomic Comparison of Bloomington Junior High Schools

                 Composite Elementary School Ranking
School           Socioeconomic        Pupil Tests

Penn                   8                   7
Portland              18                  17
Oak Grove             13                  13
Olson                  7                   8

Oak Grove Junior High seemed to be a school that would
provide a median type of pupil population. At Oak Grove, pupils
were modularly scheduled with science-mathematics as a scheduled
instructional block. It was possible at this school to give task
interviews within a pupil's scheduled science time or independent
study time. An 8 x 8 foot room off the science office was used
for the task interviews. In this room were a table, a chair for
the subject, a chair for the interviewer, a tape recorder to record
task interviews and 19 small boxes, each holding the equipment for
one of the tasks. An average of 25 minutes was spent with each
pupil in completing all five tasks.

Sample Selection 

A random sample of 40 pupils was selected from the Oak
Grove grade eight pupil population of 485 pupils. This random
sample had the following composition as compared with the total
population as shown in Table 4.2.

Table 4.2

Comparison of Characteristics of Initial Sample
with Total Population

                               Bloomington   Oak Grove   Sample of 40
                               Grade 8       Grade 8     Oak Grove
                               Pupils        Pupils      Pupils

Number
% male                             51            51           70
% female                           49            49           30
Average Lorge-Thorndike IQ        110         110.5        111.4



Because of the number inequalities in the male-female 
composition of the sample, it was judged to be atypical. It was 
decided, therefore, to stratify the population by sex and ability. 

The pilot study results were reexamined for correlations 
between proportional reasoning and the verbal, nonverbal and total
IQ scores of the Lorge-Thorndike measure. Piagetian levels 
obtained from task interviews were found to have the following 
product moment correlation coefficients with Lorge-Thorndike IQ 
measures: nonverbal, .67; verbal, .71; total, .71. The calculation 
of these values is found in Appendix A. 

The intent was to select a sample of approximately equal
numbers of boys and girls and to have a range of abilities to
ensure that all levels of proportional reasoning would be
represented. Pupil nonverbal Lorge-Thorndike scores were mapped
out (see Table 4.3). Choice was made by numbering consecutively




Table 4.3
Pilot Sample Characteristics

Lorge-Thorndike                    Sample                   All
nonverbal scores      Boys    Girls    Boys & Girls      Oak Grove

118 and above           5       8          13               149
99 to 117              11       4          15               247
98 and below            5       7          12                86

Totals                 21      19          40               482



all persons (boys and girls) within the Lorge-Thorndike level and
then selecting with computer generated random numbers. When a
randomly identified student was found to have moved from the
district, another random number was used in the same manner.
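
The selection procedure can be sketched as below. The stratum sizes and sample sizes are those of Table 4.3; the function name, the seed and the set of pupils who had moved are hypothetical, and Python's pseudo-random generator stands in for whatever generator was actually used.

```python
import random


def draw_stratum(n_pupils, n_wanted, moved, rng):
    """Draw n_wanted distinct pupil numbers from 1..n_pupils.

    Numbers of pupils who have moved, and numbers already drawn,
    are skipped and another random number is used.
    """
    if n_wanted > n_pupils - len(moved):
        raise ValueError("not enough eligible pupils in this stratum")
    picked = []
    while len(picked) < n_wanted:
        number = rng.randint(1, n_pupils)   # both ends inclusive
        if number in moved or number in picked:
            continue
        picked.append(number)
    return picked


rng = random.Random(1976)   # arbitrary seed so the sketch is repeatable
strata = {
    "118 and above": (149, 13),
    "99 to 117": (247, 15),
    "98 and below": (86, 12),
}
moved = {"118 and above": {7}, "99 to 117": set(), "98 and below": {3, 40}}
sample = {name: draw_stratum(size, wanted, moved[name], rng)
          for name, (size, wanted) in strata.items()}
print({name: len(numbers) for name, numbers in sample.items()})
```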



The levels and the sample sizes within the levels were
chosen, not to ensure a sample representative of all grade 8 pupils,
but to ensure a sample with pupils at each of the four levels of
proportional reasoning. Deliberately, larger proportions of pupils
were thus chosen from the lower and from the higher Lorge-Thorndike
ranges.

Basic Design 

The task interview phase was used to measure proportional
reasoning levels of 40 pupils through intensive interviews wherein
the pupil would manipulate physical objects while completing the
proportional reasoning tasks the pupil was assigned. The
interviewer followed a general format but asked open and probing
questions after the manner of Chittenden and Bybee. The
interviewer's format was reviewed by Dr. Edward Chittenden during the
October 1974 Educational Testing Service Criterion-Referenced
Testing Seminar and by Dr. Roger Bybee in meetings with the writer
in December 1973.

Task items involved proportionality with direct, inverse,
direct-as-the-square and inverse-as-the-square proportions. The
cognitive content of the task was obtained from a variety of areas.
Physical tasks were those arising out of some physical law or
action. Geometric tasks were those arising out of geometric
figures. The nature of these task items is summarized in Table 4.4.

Task 1, the Shadow Task, and Task 19, Incline, were adapted
by Hensley (1974) from the work of Inhelder and Piaget (1958).
Task 2, Mr. Tall, was a task used by Karplus and Karplus (1970).
Task 3, the Sled Task, was an adaptation of a task of Piaget (1970).
Task 15, Pulley, and Task 16, Ruler, were those designed by Karplus,
Karplus and Wollman (1974). Wollman, Hensley and Karplus extended
permission for the writer's use of these tasks. The first three tasks,
termed "literature tasks," were given to all 40 subjects. The
other tasks, largely designed by the writer and termed "derived"
tasks, were each given to at least five subjects.

This pattern of task assignment used with pupils meant
that the first five pupils had tasks 1, 2, 3, 4 and 5. The second
five pupils had tasks 1, 2, 3, 6 and 7. The third five pupils had
tasks 1, 2, 3, 8 and 9; the fourth five pupils had tasks 1, 2, 3,






Table 4.4
Task Specifications

                                     Proportionality
                                              Direct as    Inverse as
Title                Direct       Inverse     Square       Square       Cognitive Content

 1. Shadow                        Physical                              Light
 2. Mr. Tall         Geometric                                          Scaling
 3. Sled                                      Physical                  Motion - Acceleration
 4. Angle            Geometric                                          Similar
 5. Balance          Physical                                           Lever
 6. Flagpole         Physical                                           Light
 7. BB Square        Physical                 Geometric                 Area
 8. Pattern                                   Geometric                 Scaling
 9. Frosting                                               Geometric    Inverse Square Law
10. Paint            Physical                                           Chemical Proportions
11. (illegible)
12. Boyle                         Physical                              P/V - Gas Laws
13. Population                                Physical                  Density
14. Probability      Physical                                           Statistics
15. Pulley           Physical                                           Displacement
16. Ruler            Physical                                           Displacement
17. Weight           Physical                                           Statistics
18. Light & Shadow   Physical                                           Light
19. Incline          Physical                                           Simple Machines

Totals               11 Physical  2 Physical  2 Physical   1 Geometric
                     2 Geometric              2 Geometric




10 and 11; the fifth five pupils had tasks 1, 2, 3, 12 and 13; the
sixth five pupils had tasks 1, 2, 3, 14 and 15; the seventh five
pupils had tasks 1, 2, 3, 16 and 17; and the last or eighth five
pupils had tasks 1, 2, 3, 18 and 19.
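
The assignment pattern just described can be written out compactly. The sketch below only restates the pattern; the function name and the numbering of pupils from 1 to 40 are assumptions for illustration.

```python
from collections import Counter


def phase2_assignment(n_groups=8, group_size=5):
    """Map pupil number -> list of five tasks, following the pattern above."""
    literature_tasks = [1, 2, 3]           # given to every pupil
    plan = {}
    pupil = 1
    for group in range(n_groups):
        derived_tasks = [4 + 2 * group, 5 + 2 * group]
        for _ in range(group_size):
            plan[pupil] = literature_tasks + derived_tasks
            pupil += 1
    return plan


plan = phase2_assignment()
counts = Counter(task for tasks in plan.values() for task in tasks)
print(plan[1], plan[40])
print(counts[1], counts[19])
```

Counting the assignments confirms that each literature task is performed by all 40 pupils and each derived task by five pupils.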

Interview tasks were designed with written description of
the testing protocol, the scoring and the setting. Protocols were
to be open ended with the examiner making notes, asking for certain
pupil responses and recording the interview on tape.

The description for Task 1, Shadows, follows. The complete
set of task descriptions may be found in Appendix B.

1. Projection of Shadows (Hensley, 1974)

Thinking Tested: 

Schema of Proportions
Inverse proportion - Physical 

Material: 




A screen, 30 cm x 30 cm, is used to observe the shadows.
The shadows are made by three wire rings, 3.0 cm, 6.0 cm and 9.0
cm in diameter. Each ring has a support wire. The length of the
support wire is such that the center of each ring is 12.5 cm above
the bottom of the support wire. The rings are made from different
colors of wire as follows: 3.0 cm (white), 6.0 cm (red), 9.0 cm
(black). The rings are held vertically on a meter stick by optic
bench screen holders. The meter stick has only marks at each 10 cm
length. Each mark is labeled with the following letters: N, R, M,
K, G, F, A, B and O. A clear light bulb is supported at one end of
the beam. The center of the bulb is 12.5 cm above the top of the
beam. The light is turned on and off by connecting or
disconnecting the cord to the 6 volt battery. One meter stick marked in
centimeters and millimeters is provided for the pupil to use.

Introduction:

"Here is a board, a light and a screen. I can put up one
ring (6.0 cm) on the board (at 50 cm) and then when I turn on the
light (do it), I get a shadow of the ring on the screen."

Question:

Initially seek out predictions of the effects of ring size
and ring position on the shadow with questions such as: "What
would you predict will happen if I use this smaller (3.0 cm) ring?"
"What else could change the size of the shadow?" "How?" Do what
is suggested.

Culminating Question: 

"How might I make just one shadow using two rings?" "Explain 
why this works?" 






Scoring Criteria:

Stage   Criteria                                              Score

I       The subject represents the shadow in the way the        0
        object appears to him. He does not perceive how
        the shadow is formed on the screen.

IIA     The subject recognizes that the size of the shadow      1
        depends on the size of the object. His knowledge
        goes no further.

IIB     In addition to the ring-size dependence of the          2
        shadow demonstrated in IIA, the subject suggests
        qualitatively that the distance affects the shadow
        size, the closer the object is to the screen, the
        smaller the shadow.

IIIA    The subject quantitatively compensates between          3
        distance and shadow size, between distance and
        diameter, but is not generalized as a rule. The
        subject begins to measure the distance from the
        light source.

IIIB    From the start the subject measures both the            4
        distance from the light source and the diameter
        of the rings. He looks for a numerical
        hypothesis based on the divergent structure of
        the light rays. The subject is able to state in
        a numerical form the general relation for the
        two rings to have just one shadow.
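
The numerical relation sought at Stage IIIB can be illustrated under the assumption of a point light source, so that the shadow diameter follows from similar triangles. The function names and the screen distance of 100 cm are hypothetical; the ring diameters and the 50 cm position are those of the task description.

```python
def shadow_diameter(ring_diameter, ring_distance, screen_distance):
    """Shadow diameter for a ring at ring_distance from a point light source.

    By similar triangles the shadow grows in direct proportion to the ring
    diameter and in inverse proportion to the ring's distance from the light.
    """
    return ring_diameter * screen_distance / ring_distance


def distance_for_same_shadow(first_diameter, first_distance, second_diameter):
    """Distance at which a second ring casts the same shadow as the first.

    Two rings give just one shadow when distance / diameter is the same
    for both.
    """
    return first_distance * second_diameter / first_diameter


SCREEN = 100.0   # assumed distance from light to screen, in cm

print(shadow_diameter(6.0, 50.0, SCREEN))          # 6.0 cm ring at 50 cm
small = distance_for_same_shadow(6.0, 50.0, 3.0)   # position for the 3.0 cm ring
large = distance_for_same_shadow(6.0, 50.0, 9.0)   # position for the 9.0 cm ring
print(small, shadow_diameter(3.0, small, SCREEN))
print(large, shadow_diameter(9.0, large, SCREEN))
```

Under this assumption the three rings give a single shadow whenever their distances from the light stand in the same ratio as their diameters.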



Phase II Results 



Pupil responses to task interviews were collected in pupil
notes, observer notes and audio tape records. Pupil responses
were scored by the writer according to criteria as described for
each task in Appendix B. Overall calculation of correlations
between these task scores was not made but postponed for analysis
with the final results of Phase III. The scores and the averages
were used at that time.




For a qualitative analysis of results, a composite listing
was made of all pupil scores, the average scores on literature based
and derived tasks, and the overall average. The task scores in
this phase were more consistent than task scores in the pilot
phase. The average pupil task levels are listed in Table 4.5.
These averages cluster at Level II. Some pupils did achieve every
level.



Table 4.5
Pupil Task Averages by Level

                                        Level
Task                   0          I          II         III        IV
                    (0-0.4)   (0.5-1.4)  (1.5-2.4)  (2.5-3.4)  (3.5-4.0)

Literature tasks       0          6          22          9          3
All tasks              0          4          22         11          3



The difficulty of the literature tasks was estimated by
averaging the pupil scores obtained for each of these three tasks.
They were respectively: task 1, 2.40; task 2, 2.30 and task 3, 2.08.



Implications for Phase III

Recorded pupil responses were retained for building the 
paper-pencil items of Phase III. Pupils on task 3 had a low 
overall average. Because it was suspected that task 3 had a 
higher difficulty, multiple choice answers were designed with 
clear illustrations of the motion that the item questioned. 






It was not conclusive that any tasks should be eliminated.
All tasks were written as items at each of the four levels of
proportional thinking, insofar as possible. All of these tasks
were the content of test items. Some 76 items were used for the
first testing in Phase III.






CHAPTER 5
PHASE III - PAPER-PENCIL TESTING

Phase III of the study was the design and selection of
items for a paper-pencil instrument to measure proportional
reasoning. Paper-pencil testing started with a set of 76 items
administered to the 40 pupils who had been tested with interview
tasks in Phase II. The content of the items was that of the 19
Phase II tasks. As many as four items were written for each task
covering the four proportional reasoning levels.

Pupil performance was used to judge item effectiveness in
the selection of a set of 24 items from an initial set of 76 items.
This selection and the continued item improvements made through
further testing are described in this chapter.

Test Versions and Sample Selection 

Ten versions of the test were administered. Each version 
was an improvement over previous ones as a consequence of the 
changes in items or the replacement of some items with others. 
Table 5.1 summarizes the characteristics of each version, the 
pupil samples that were tested and the relationship between the 
versions. 

Version I consisted of 76 items over the four levels of
proportional reasoning. This was administered to 40 eighth grade

Table 5.1
Test Versions and Pupil Samples

Test                                                              Pupil Sample
Version  Characteristics                          Number  Description       Selection

I        76 items                                   40    Grade 8           Pupils selected randomly within
                                                          "transitional"    three intelligence levels for
                                                                            task testing

II A     24 items                                   29    Grade 8           Randomly selected from 385
         6 each of 4 levels                               "transitional"

II B     12 items per pupil in a "matrix"           27    Grade 5           One total class
         sample                                           "non-masters"
         Another 6 from among Levels II,
         III and IV

II C     Same test for all                          77    Chemistry         Chemistry classes at one high
         6 at Level I; 6 at Level II;                     pupils            school
         6 at Level III; 12 at Level IV                   "masters"

III A    29 items; 6 at each Level I, II,          393    Grade 8           All Grade 8 pupils in one school
         III and IV.                                      "transitional"
         Five additional items for Level II

III B    12 items per pupil in a "matrix"           30    Grade 5           One total class
         strategy.                                        "non-masters"
         The same 6 Level I for all.
         Another 6 chosen from Levels II
         and III

IV A     30 items, 6 at each Level I, II,           77    2 separate        77 pupils selected randomly
         III and IV;                               195    Grade 8 groups    from 385
         6 additional Level III items                     "transitional"    195 as half of the total
                                                                            Grade 8 population

IV B     30 items, 6 at each level and 6            69    Physics classes   Physics classes in one high
         additional Level IV items                        "masters"         school

V A      30 items, 6 at each level and 6           427    Grade 8           All Grade 8 pupils in one school
         additional Level IV items                        "transitional"

V B      Identical with V A except for the
         substitution of 2 items and rescoring








pupils selected randomly within three intelligence levels for task 
testing. 

Version II A, which resulted from review of Version I
results, had two related versions, II B and II C. Version II A,
the basic set of items, consisted of 24 items, six items at each of
the four proportional reasoning levels. Twenty-nine pupils,
randomly selected from a group of 385 grade eight pupils, were
tested with this version.

Version II B had three forms designed so that responses of
a class of 27 fifth grade pupils, supposed non-masters, to Level I
items could be analyzed thoroughly and some measurement could be
made of the other items. Each of the forms had twelve items. Six
of the items in each form were the six Level I items from Version
II A. The additional six items were selected from each of the
other three levels.

Version II C was a 30 item adaptation of Version II A that
was used with 77 high school chemistry pupils, supposed masters, to
thoroughly analyze Level IV items. An additional six Level IV
items were used along with the Version II A items in order to
consider some replacement of Level IV items.

Version III A, which was administered to 393 grade eight
pupils, was the result of the improvements in Version II.
Twenty-nine items were used in this version, six at Level I, eleven at
Level II, six at Level III and six at Level IV. The additional
Level II items were intended for consideration for improvement of
Level II.



Version III B, administered to 30 fifth grade pupils, was
designed as two forms with 12 items each. Six Level I items of
Version III A and three items each from Levels II and III of
Version III A were used in the two forms. A special purpose of
this testing was the improvement of Level I items.

Version IV A was a set of 30 items that was administered
to 272 eighth grade pupils. Seventy-seven of these pupils were
randomly selected from the 385 grade eight pupils of a school.
The additional 195 pupils were the grade eight pupils enrolled in
second semester science classes in another school. The test
contained six Level I items, six Level II items, twelve Level III
items and six Level IV items. Overall item improvement was
intended from this testing as was the possible replacement of some
Level III items.

Version IV B contained most of the items used in Version
IV A with the exception that six items were used at Level III and
twelve items at Level IV. The responses of the supposed masters
who took the test, 69 high school physics pupils, were used to
improve the upper levels of the test.

Versions V A and V B were administered to 427 grade eight
pupils, essentially all the grade eight pupils in one junior high.
The purpose of this testing was to develop descriptive statistics
regarding the final version of the test. Version V A and V B were
the single test that was to be the final test version of 24 items.
Thirty items were used. The 24 items that were scored as the basic
test consisted of six for each of the four levels. Six additional
Level IV items were included. With the replacement of two of the
original Level IV items by two from the additional six items which
were part of Version V A, Version V B came into being upon rescoring
the papers.

Basic Design 

The paper-pencil testing was carried out to select a final 
form of 24 items, six items at each of four levels. An initial 
set of 76 items was written. Each item of the initial 76-item set 
was constructed according to procedures for good item construction 
after Mehrens and Lehman (1972). Only procedures 5-9 inclusive 
were pertinent. 

5. Prepare a table of specifications 

6. Decide upon the type of format to be used 

7. Prepare test items 

8. Evaluate 

9. Revise 

The table of specifications used is found in 
Table 5.2. It can be seen that the items were to sample all levels 
and to be written in both a geometric and a physical context. Content 
of the test items came from the nineteen tasks used in task 
interviews. Pupil responses to these tasks were helpful in forming the 
items. 

The paper-pencil test items, the item key and the 
distractors were written to specific criteria from Inhelder and 
Piaget (1958). This was in accord with the specifications of 






Table 5.2 

Specifications of Paper-Pencil Items Desired 

                         Stage and Level 

              Concrete Stage        Formal Stage       Approximate 
Context       Level I   Level II    Level III  Level IV   Totals 

Geometric        a         a           a          a         30 
Physical         a         a           a          a         50 

Total           20        20          20         20         80 

a  Exact numbers in each context were not established ahead of time. 



Glaser and Cox (1968) for criterion-referenced measures. As Glaser 
and Nitko (1971) prescribed, the classes of behavior for each level 
were specified as clearly as possible before the test was 
constructed. 

Paper-pencil test item format, criteria and test examples 
are illustrated by level in Figures I, II, III and IV. The key is 
located as the first answer in these examples. In practice, 
however, the locations of the key and distractors were varied by 
setting out all possible combinations of the first four answers 
and then randomly assigning them. 

Answer "E," I have no answer, was always placed as the 
last answer. Thus, a pupil need not enter a guess when no answer 
seemed plausible. 
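
The answer-ordering procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the placeholder answer labels and the fixed random seed are hypothetical and do not come from the study, which does not report how the random assignment was carried out.

```python
import itertools
import random

# Hypothetical labels for the four substantive answers of one item.
FIRST_FOUR = ["key", "distractor 1", "distractor 2", "distractor 3"]

def assign_answer_orders(n_items, seed=0):
    """Pick one of the 24 possible orderings at random for each item.

    The fifth answer, "I have no answer", always stays in last place.
    """
    orders = list(itertools.permutations(FIRST_FOUR))  # all 4! = 24 orderings
    rng = random.Random(seed)  # seed is an assumption, for repeatability only
    return [list(rng.choice(orders)) + ["I have no answer"]
            for _ in range(n_items)]

layouts = assign_answer_orders(24)
for number, layout in enumerate(layouts[:3], start=1):
    print(number, layout)
```

Each layout lists the answers in the order A to E for one item, so the key can fall in any of the first four positions while answer E is never moved.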






Item Design Concrete I Stage (Level I) 

              Stage              Score Criteria 

Key           Concrete I         Subject compensates in a qualitative 
                                 way. May match two direct ordered 
                                 relations or use addition or 
                                 subtraction to contrast or calculate 
                                 ratios 

                                 A > B > C > D 
                                 .   .   .   . 
                                 J > K > L > M 

Distractor    Reasoned Guess     Subject makes erroneous connection 
                                 but one which involves appropriate 
                                 elements 

Distractor    Reasoned Guess     Subject makes reverse ordered 
                                 connection but involves elements 

Distractor    Illogical Guess    Subject guesses or makes no ordered 
                                 connection, nonsensical 

Distractor    None               Subject makes no response 

Item Example (11C1) 

A car moving at a constant speed of 30 mph will, if pictured at 
one second intervals, look like: 

    I 
    II 

Answer                                                 Stage 

A. I because it moves equal distances each second      Concrete I 
D. II because it is increasing its distance            Reasoned Guess 
C. II because it changes                               Reasoned Guess 
B. None of these because it is moving                  Illogical Guess 
E. I have no answer                                    None 

Figure I. Level I Item Design and Example: Test Item 5 






Item Design Concrete II Stage (Level II) 

              Stage              Score   Criteria 

Key           Concrete II          4     Subject orders corresponding 
                                         relations (with inverse) 

                                         A > B > C > D 
                                         .   .   .   . 
                                         J < K < L < M 

Distractor    Concrete I                 Subject compensates in some 
                                         qualitative, non-ordered way 
                                         (or direct - not inverse) 

Distractor    Reasoned Guess             Subject makes erroneous connection 
                                         but one which involves elements 

Distractor    Illogical Guess            Subject guesses or makes no 
                                         connection between how things change 

Distractor    None                       Subject makes no response 

Item Example (l^C^) 

These nature hunt groups are chosen for a nature hike. The teacher 
with the most pupils to help is: 

    Mrs. Andrews               - 5 pupils 
    Mr. Denton & Mrs. Felk     - 8 pupils 
    Mr. Holt                   - 6 pupils 

Answer                                                 Stage 

A. Mr. Holt because 6/1 is larger than 5/1 is          Concrete II 
   larger than 8/2 
C. Mr. Denton and Mrs. Felk because they have          Concrete I 
   the most pupils 
B. Mr. Denton and Mrs. Felk because 2/8 is             Reasoned Guess 
   larger than 1/5 is larger than 1/6 
D. Mrs. Andrews because she has fewer pupils           Illogical Guess 
E. I have no answer                                    None 

Figure II. Level II Item Design and Example: Test Item 21 






Item Design Formal I Stage (Level III) 

              Stage          Score   Criteria 

Key           Formal I         4     Subject multiplies, uses simple 
                                     ratios, contrasts ratios and can 
                                     order them     5/25   2/25 
                                                    5/25 x 10 = 2 

Distractor    Concrete II      3     A rule, usually addition or 
                                     subtraction, is used to contrast 
                                     or calculate ratios 

Distractor    Concrete I       2     Subject compensates in some 
                                     qualitative way 

Distractor    Guess            1     Subject guesses or makes no 
                                     connection between how things 
                                     change 

Distractor    None             0     Subject does not respond 

Item Example (10F1) 

Jim uses 4 heaping teaspoons of Tang powder with an 8 oz. glass of 
water. How much Tang is needed for the same mixture with 12 oz. 
of water? 

Answer                                                 Stage 

A. About 6 teaspoons because 12/8 x 4 tsp. =           Formal I 
   6 tsp. 
B. About 8 teaspoons because 8 oz. + 4 oz. =           Concrete II 
   12 oz. and 4 tsp. + 4 tsp. = 8 tsp. 
C. More than 4 teaspoons because there is more         Concrete I 
   water 
D. 4 teaspoons because it is the same mixture          Guess 
E. I have no answer                                    None 

Figure III. Level III Item Design and Example: Test Item 11 






Item Design Formal II Stage (Level IV) 

              Stage          Score   Criteria 

Key           Formal II        4     The subject calculates using 
                                     proportions and recognizes the 
                                     appropriate proportions to be used: 
                                     A/B = C/D  or  A/B = C/D = E/F 

Distractor    Formal I         3     The subject multiplies or uses 
                                     simple ratios 

Distractor    Concrete II      2     A rule, usually addition or 
                                     subtraction, is used to calculate 
                                     the increase or decrease 

Distractor    Concrete I       1     The subject compensates in some 
                                     qualitative way 

Distractor    None             0     The subject guesses or makes no 
                                     connection between how things 
                                     change 

Item Example (?F2) 

Sketch #1 of a house is 5 pencil widths or 2 pennies high. Sketch 
#2 of this house is not shown. Sketch #2 looks the same but is 8 
pencil widths high. How high must sketch #2 be in pennies? 

Answer                                                 Stage 

B. About 3 because 2/5 = 3.2/8                         Formal II 
C. About 3 because 2/5 x 8 = 3.2                       Formal I 
A. About 3 because 8 - 5 = 3                           Concrete II 
D. About 3 because it has to be more                   Concrete I 
E. I have no answer                                    None 

Figure IV. Level IV Item Design and Example: Test Item 22 






Phase III Results/Interpretations 

Each testing period was followed by an analysis of results 
and an improvement of the item set. Deficient items were modified 
or replaced. In the first stage, item analysis consisted of 
comparing the overall results with expectations. In later stages of 
analysis the response patterns of masters and non-masters were 
contrasted. In the last stages a biserial r was calculated to 
evaluate the correlation of scores of masters with the levels 
assigned by testing, and the mean scores of item masters 
and non-masters were reported. 

Version I 

Item writing for Version I produced 76 items. Table 5.3 
summarizes the content and levels of these items. Seventeen items 
were written at the Concrete I stage, 17 at the Concrete II stage, 
18 at the Formal I stage and 24 at the Formal II stage. In total, 
20 items were written with geometric context and 56 with physical 
context. Usually four items were written from each task, although 
as many as five and as few as one were written. 

It was intended that the final planned array for Version II 
after item selection would be that of Table 5.4. 

Observed pupil performance was used to select items for 
Version II. The test was taken by 40 pupils who had been selected 
to give performance at every level of proportional reasoning and 
who had demonstrated such proportional reasoning in task testing. 






Table 5.3 

Content and Stage of Version I Paper-Pencil Items 

Piagetian Stage 
F2 or G2   Formal II 
F1         Formal I 
C2         Concrete II 
C1         Concrete I 



Proportionality 

P=Physical Inverse Direct Inverse 

G=Geoinetrical Mult' n of Mult 'n of Ordering as as 



Content 


Context 


Relations 


Relations 


proportions 


Direct 


Inverse Square 


Square 


1. Shadow 


P 


C^ 


^2 


^1 








2. Mr. Tall 


G 


^1 


^2 






'2 




3. Sled 


P 


^1 












h, Angle 


G 


^1 












5, Balance 


P 


^1 












6. Flag Pole 


P 






h 








7, BB Square 


G 


^1 








¥z 




8. Pattern 


G 






h 








9, Frosting 


G 


h 












10, Paint 


P 


^1 












11, Speed 


P 


^1 


C. 
c 


h 


h 







Table 5.3 (continued) 
Content and Stage of Version I Paper-Pencil Items 



Proportionality 

Paphysical Inverse Direct Inverse 

GsGeometrical Mult 'n of Mult 'n of Ordering as as 



Content 


Context 


Eelations 


Relations 


Proportions 


Direct 


Inverse 


Square 


Square 


12, Boyle 


P 


^1 




\ 










13, Population' 


P 


^1 




h- 










IK Probability 


P 


^1 














15. Pulley 


P 


^1 




h 


\ 








16, Ruler 


P 


^1 






\ 








17. Weight 


P 


^1 




h 


h 








1 

18, Light & Shadow P 






h 










19, Incline 


P 








h 










Totals:   56 Physical     20 Geometrical 

          17 C1     17 C2     18 F1     24 F2 

          76 items 



Table 5.4 

Version II Test Item Content and Stage 

                            Stage (Levels) 
                    Concrete               Formal 
Content         Level I   Level II   Level III   Level IV   Total 

Geometric &        6         6           6           6        24 
Physical 



These general decision rules, as shown in Table 5.5, were applied: 

1. Choose items which approximate these levels of 
pupil performance: 

Level I     50 - 60 % correct 
Level II    40 - 55 % correct 
Level III   30 - 45 % correct 
Level IV    20 - 35 % correct 

Such percentages were chosen from recognition that 
correct answers to four of the six items at a level would be 
mastery. It was also expected (Hensley, 197^; Karplus 
and Karplus, 1970) that most pupils would achieve 
Level I, 70 per cent would achieve Level II, 25 per 
cent Level III and 10 per cent Level IV. 

2. Use items with a variety of content and have both 
geometric and physical contexts within the selected 
items. 

3. Change items in accord with Piaget theory and item 
design requirements for answers which have defined 
characteristics. 

Because a combination of these rules was applied, an item was not 
rejected upon failure to meet any one rule. 
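
As a hedged illustration of decision rule 1, the sketch below checks whether an item's per cent correct falls inside the target band for its level. The band limits are the ones listed above; the function name is hypothetical, and the sample percentages are the Level II figures as read from Table 5.5. The check covers only the first rule, which the study combined with the other two before deciding on an item.

```python
# Target bands of per cent correct, by level, from decision rule 1.
TARGET_BANDS = {"I": (50, 60), "II": (40, 55), "III": (30, 45), "IV": (20, 35)}

def within_band(level, percent_correct):
    """Return True when the item difficulty lies inside its level's band."""
    low, high = TARGET_BANDS[level]
    return low <= percent_correct <= high

# Level II per cent correct values as read from Table 5.5.
level_two_items = {"1C2": 38, "3C2": 35, "5C2": 28,
                   "6C2": 25, "11C2": 60, "14C2": 68}

for item, percent in level_two_items.items():
    print(item, percent, within_band("II", percent))
```

Items 11C2 and 14C2 fall outside the band on the easy side, yet Table 5.5 marks them "Use", which is consistent with the statement that no single rule was enough to reject an item.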




Table 5.5 

Characteristics of Selected Version I Items for Version II 



Level I Items 
















Test Item 


1C3_ 






^1 


11C3_ 


li|C3_ 


Average 


^ Correct 


53 


56 




63 


58 


53 




Decision 


Use 


Change 


Change 


Use 


Use 


Use 




Level II Items 

Test Item    1C2      3C2      5C2      6C2      11C2    14C2    Average 
% Correct    38       35       28       25       60      68      42.3 
Decision     Change   Change   Change   Change   Use     Use 




Level III Items 
















Test Item 






lOF^ 


IIF^ 




18F3_ 


Average 


% Correct 




38 


55 




20 


25 


39.0 


Decision 


Use 


Use 


Change 


Use 


Change Change 




Level IV Items 
















Test Item 


IF2 


hF^ 


^2 


IIF2 




I9F2 


Average 


% Correct 




2k 


2k 


28 


10 


31 


21.8 


Decision 


Use 


Use 


Change 


Use 


Use 


Use 








Version II 

Version II, prepared through the selection process 
previously described, consisted of a basic set of 24 items. 
Version II was used in a different form with each of three groups: 

Version   Characteristics             Population 

II A      24 items; 6 from each       29 randomly selected 
          level; 2 forms              Grade 8 pupils 

II B      12 items per pupil;         27 Grade 5 pupils 
          3 forms, each with          (one class) 
          6 Level I items and 
          6 items from the            Probable non-masters 
          other levels 

II C      30 items; 6 for each        77 Grade 11 pupils 
          level; 6 additional         (chemistry) 
          items from Level III; 
          2 test forms                Probable masters 

All testing was done with at least two forms of the test in which 
items were randomly ordered. Form 2 had the reverse item order 
from Form 1. 
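
The two-form arrangement can be pictured with a short sketch: the items are shuffled once for Form 1, and Form 2 presents the same items in reverse order. The function name, the placeholder item labels and the seed are assumptions made for illustration only.

```python
import random

def make_forms(item_ids, seed=0):
    """Return (form1, form2): a random item order and its exact reverse."""
    rng = random.Random(seed)  # seed is an assumption, for repeatability only
    form1 = list(item_ids)
    rng.shuffle(form1)
    form2 = form1[::-1]
    return form1, form2

items = [f"item {n}" for n in range(1, 25)]  # hypothetical labels for 24 items
form1, form2 = make_forms(items)
print(form1[:3], form2[-3:])
```

Under this arrangement an item that appears early in one form appears late in the other.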

Decision rules for improvement of Version II were more 
complex than for Version I. The scoring provided for a 
classification of a pupil's level of proportional reasoning. The assigned 
reasoning level was then used to categorize responses. It was 
possible then to note how the items discriminated between 
proportional reasoning levels. 

A pupil was assigned as a master of a particular level when 
he achieved correct responses for four of the six items assumed to 
be written at that level. It was reasoned that with six items per 
level and four responses per item (response E always being 
"I have no solution"), the probability of success by pure guessing 
would be one-fourth per item. For six items, then, it was probable 
that two items might be answered correctly by pure chance. 
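
The guessing argument can be checked with a short binomial calculation. This is a sketch under the assumption that each of the six items is guessed independently with probability one-fourth; the function name is hypothetical. Under that assumption the expected number correct is 1.5, which the text rounds to two, and reaching the four-of-six criterion by guessing alone is unlikely.

```python
from math import comb

def prob_at_least(k, n, p):
    """Probability of k or more successes in n independent trials."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

n_items, p_guess, criterion = 6, 0.25, 4

expected_correct = n_items * p_guess
p_mastery_by_chance = prob_at_least(criterion, n_items, p_guess)

print(expected_correct)               # 1.5
print(round(p_mastery_by_chance, 4))  # 0.0376
```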

Through test scoring, the masters and non-masters for each 
level were identified. Since all pupils were tested on all items, 
the scoring may be thought of as a classification scheme where 0 
denotes non-mastering and 1 denotes mastering at respective levels 
(see Figure V). A person mastering all levels would follow the 
sort of performance on the right. A person failing all levels 
would follow the performance on the left. 

This Version II scoring accomplished an assignment of each 
pupil to a performance index based upon his meeting or failing the 
criterion of achieving correct responses to four of the six items at 
each level. In Table 5.6 there is a listing of all possible 
performance indices arranged by the level they probably represent. 
The numbers of eighth grade pupils and of eleventh grade pupils, the 
presumed masters in proportional reasoning, are listed by the 
performance index they achieved. As 
anticipated, most of the eleventh grade pupils, 78 per cent, 
achieved above Level II. These results suggested, however, that 
too many eighth grade pupils were being classified in Level 0 or 
Level I. 

The responses of grade 5 pupils, non-masters, were valuable 
in evaluating the Level I items. Grade 5 results, Version II B, 
were obtained by hand scoring. The results, as shown in Table 5.7, 
suggested that Level I items were working appropriately. 
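
The performance index lends itself to a compact sketch. The code below is an illustration, not the scoring program used in the study: the function names are hypothetical, and the level rule (count the passes from Level I upward until the first failure) is inferred from the way Table 5.6 groups the indices, for example 1010 under Level I and every index beginning with 0 under Level 0.

```python
def performance_index(correct_by_level, criterion=4):
    """Build the four-digit index from the number correct at Levels I-IV.

    A level is passed (1) when at least `criterion` of its six items
    are answered correctly, and failed (0) otherwise.
    """
    return "".join("1" if count >= criterion else "0"
                   for count in correct_by_level)

def assigned_level(index):
    """Count consecutive passes starting at Level I (0 = Preoperational)."""
    level = 0
    for digit in index:
        if digit != "1":
            break
        level += 1
    return level

index = performance_index([5, 4, 2, 1])
print(index, assigned_level(index))   # 1100 2
print(assigned_level("1010"))         # 1
print(assigned_level("0111"))         # 0
```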




                  Failing                               Passing 
                  Performance                           Performance 
                  Index                                 Index 

Level I items     0     Fails Level I                   1     Passes Level I 
Level II items    00    Fails Levels I and II           11    Passes Levels I and II 
Level III items   000   Fails Levels I, II and III      111   Passes Levels I, II and III 
Level IV items    0000  Fails all levels =              1111  Passes all levels = 
                        Preoperational Stage - Level 0        Formal II - Level IV 

Figure V. Performance Index 






Table 5.6 

Performance of "Masters" and "Transitional" Pupils 
on Versions II A and II C 

                                   Grade 8           Grade 11 
                   Performance     Pupils            Chemistry 
Level              Index a         "Transitional"    Pupils 
                                   N = 29            N = 75 

Level 0            0000               11                1 
(Preoperational)   0001                0                0 
                   0010                0                1 
                   0011                0                0 
                   0100                0                0 
                   0101                0                0 
                   0110                0                0 
                   0111                0                0 

Level I            1000               10                1 
                   1001                0                0 
                   1011                0                0 
                   1010                5                8 

Level II           1100                0                5 
                   1101                0                0 

Level III          1110                3               36 

Level IV           1111                0               23 

a  This notation describes the levels passed and failed, 
   e.g., 1111 means 

       Passed Level I 
       Passed Level II 
       Passed Level III 
       Passed Level IV 







Table 5.7 
Version II B Results 



Responses i^evel I Items Level II Items Level III Items Lavel IV Items 

1 L Mlll^ 1 3 3 611 111 2810111718 1 9G2 17"19 

■I ■ — - ■ .1 ■ ■ I ■III " I I ■ N II I I I M ■ ■ H I 11 .III I » II li I I I ■ H i r - !■ Ill .PI ■ I 1^ 

A llji 3 110 2 105315 3 0 0 li 5 1 11 3 _ 1 

732228 213302 1 1 li 3 1 1 I ^ 21 



1| 2 215 1| 8 1 1| 2 0 l| 1 031720 02 11 
D 12 215 5 0 7 3 !i ^ 1 2 1 221303 2130 

E 2 6 M 9 1 0 1 6 3 0 0 133011 1116 



Correct answers are underlined. 




Items 11C1 and 14C1 could have been too hard since they were 
answered correctly by fewer pupils. Results from other levels 
confirm that these items do discriminate. 

Table 5.8 lists responses for all grade 8 pupils, for grade 8 
Level 0 pupils (0000) and for grade 8 Level I pupils (1000). 

Table 5.8 

Level I Item Results for Grade 8 Pupils on Version II A 

               Per cent correct by student description 

Item number    All       0000      1000      Comment 
               N=29      N=11      N=10 

14C1            62        36        70       okay 
11C1            62         9        90       okay 
9C1             69        55        70       okay 
4C1             72        27       100       okay 
2C1             48         9        60       change 
1C1             69        36        90       okay 



The first criterion for item improvement was that items 
for Level I should be answered correctly by approximately 66 per 
cent of the eighth grade pupils. Item 2C1 did not meet this 
criterion. 

Contrasting the results of Level 0 and Level I pupils 
gives some estimation of how well each item discriminated between 
masters and non-masters. Item 11C1 was especially good at 
discrimination, as shown in Table 5.8. Item 2C1 discriminated well 
but should have been correctly answered by more persons. Item g^;^^^ 
it was concluded, needed improvement. Very familiar objects were 
substituted for the pictures of the problem. Version II item 
decisions are summarized in Table 5.9. 
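
The contrast between Level 0 and Level I pupils can be expressed as a simple difference in per cent correct. The sketch below is only an illustration: the study reports the two percentages side by side and does not name a difference index, so the function is an assumption. The figures are the six rows of Table 5.8 in the order printed there.

```python
# (per cent correct among Level 0 pupils, per cent correct among Level I
# pupils) for the six Level I items, in the row order of Table 5.8.
TABLE_5_8_ROWS = [(36, 70), (9, 90), (55, 70), (27, 100), (9, 60), (36, 90)]

def discrimination_gap(non_master_pct, master_pct):
    """Difference in per cent correct between masters and non-masters."""
    return master_pct - non_master_pct

gaps = [discrimination_gap(low, high) for low, high in TABLE_5_8_ROWS]
for row_number, gap in enumerate(gaps, start=1):
    print(f"row {row_number}: gap of {gap} percentage points")
```

A larger gap means the item separates the two groups more sharply; the second row shows the widest gap, and the third row the narrowest.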

Table 5-9 
Version II Item Decisions 



Level I Items 

Test Item ikc^ IIC^ 
% Correct 

Responses 62 62 
N = 29 

Decision Use Use 



Level II Items 

Test Item ikc^ UCg 



% Correct 
Responses 52 
N = 29 



Decision 



59 



Use Use 



9C^ 
69 

Use 
62 



Change 
Exanrple 



kC^ 2C.^ IC^ Av^j.A^ 



72 



U8 



69 



^3 



Use Change Use 
Exanrple 



5C 



3Cr 



2 -^2 
7 38 



ICg Av^i-A^ 
59 



Change Use Reduce 
Ratio Only 2 Ambiguity 
Charts 



Level III Items 

Test Item IQf^ 1TFj_ 



% Correct 
Responses 21 
K=29 



52 



Decision Change Use 
Ratio 



llFj 
Use 



lOF^ 



U5 



55 



38 



Use Change Use 
Ratio 



^3 



Level rV Items 

Test Item 19F2 iTFg HFg 
^ Correct 

Responses 31 10 28 
N = 29 

Decision Use Replace Replace 

Item Item 



hF^ IF2 Av^^A^ 

2k 2k Ik $2 

Use Replace Use 
Item 






Version II needed some improvement. Version II had the 
beginnings of appropriate discrimination, but items at each level 
needed changes. 

Version III A and Version III B 

Version III A was constructed from the experience in 
testing with Version II. These decision rules were used: 

1. Items within a level should have homogeneity in 
their overall difficulty. 

2. Items should discriminate between the responses 
of persons identified with levels of reasoning; 
that is, Level III pupils should have better 
performance on Level III items than Level II 
pupils. 

Selected items were randomly ordered through the test. Two 
versions of the test were used in all testing. One version had the 
reverse order of items from the other. The key and distractors for 
the items were randomly ordered. The population tested with 
Version III A included all grade 8 pupils in one junior high school 
(see Figure VI). Thirty grade 5 pupils, one class at an 
elementary school, were tested with Version III B. Version III B 
differed from Version III A, since it included only the lower three 
levels. 

Test deficiencies were evidenced by the very large number 
of pupils failing to meet success by the criteria for Level I and 
then showing success for higher levels. Of 227 pupils who failed 
to correctly answer four of the six Level I items, only 99 failed 
to meet the criteria at the other three higher levels. It was 



98 



20 
1111 

—29. 

Ill 30 

1110 

97 

ll" 1101 
_iiZ 

110 31 
1100 

166 1 



1 1011 

20 

101 19 

1010 

69 

10" 1 

1001 

100 US 

1000 

_393 8 

All Grade 0111 
8 pupils h3 

on _35 

0110 



88 



01 5 

Old 

010 0 

0100 

227 0 

D 0011 

22 



001 22 
0010 

139 

00 18 

0001 

117 

000 99 

0000 



Figure VI. Grade 8 Pupil Performance on Test Version III A 






found that two of the six items for Level I had been incorrectly 
keyed and that some program problem had not carried through the 
old classification. The items themselves were likely better than 
performance indicated. 

Test analysis followed the same pattern as explained for 
Version II. A summary of these improvements is provided in 
Table 5.10. 

Table 5.10 . 
Version III A Item Decisions 



Level I Items 

Test Item ikCj^ IIC^ 
^ Correct 

Responses 63 62 
N = 393 

Decision Use Change 
only 2 
examples 



9C, 
69 



Use 



he. 



72 



2C, 



Cheinge Use 
Make more 
discrimi- 
nating 



Level II Items 



Test Item 

61 



IIC2 
53 



% Correct 
Responses 

Decision Use Change 

Responses 

Level III Items 

Test Item ISf^ 17Fi 
i Correct ^ 
Responses 



6C2 
38 



5C2 

52 



Replace Change 

1 answer 



Decision 



Use Add plaus. 
answer 



Level rv Items 

Test Item 19F2 15F2 

io Correct 
Responses 

Decision Use Change 

order of 
answers 



llFi. 

Change 
1 answer 

IOF2 
62 



lOFi 
Use 

9G2 
21 



3C2 
60 
Use 

8F1 
27 



ICn 



68 61^ 



Use 



IC. 

69 
Use 

2F] 
h3 



Change Use 
Fbm stem 



5F2 . IF2 
3^ 21 



Remove 
words 
frm ratio 



Use 



Use 



Average 
66 



Average 
56 



Average 
hi 



Average 
3h 



Add 

more 

numbers 




Version IV A 

Version IV A was prepared from analysis of Version III 
results as previously described. Version IV A had thirty items. 
Twenty-four of these were the six items for each of Levels I, II, 
III and IV. An additional six items at Level III were included to 
provide improvement of Level III. Test Version IV A was taken by 
272 pupils. Of these pupils, 77 were those randomly selected from 
385 grade 8 pupils at Olson Junior High, Bloomington; 195 of these 
pupils were those eighth grade pupils taking science in the second 
semester at Portland Junior High, Bloomington. 

Version IV B had thirty items. The twenty-four items 
providing the core test of six items for each of the Levels I, II, III 
and IV were the same as those of Version IV A. The additional six 
items, however, were from Level IV to support improvement of Level 
IV items. Test Version IV B was taken by 69 pupils who were 
physics pupils at Lincoln High School, Bloomington. By maturity 
and ability these pupils were assumed to be masters of proportional 
reasoning. 

It was intended that this testing be used to improve the 
items selected for test Version V. In addition to previous item 
selection techniques, the point biserial measure of item 
discrimination was calculated. Decision rules for item improvement were: 

1. Items within a level should have homogeneity in 
their overall difficulty as evidenced in: 

a. the total percentage of persons correctly 
answering the item 

b. the percentage of persons attaining the level 
who correctly answer the item 

c. the number getting the item right and the 
number getting the item wrong 

2. Items within a level should discriminate between 
responses of persons mastering that level and those 
not mastering the level as evidenced in: 

a. pupils coded as masters of the level should 
have performance on items of that level that 
distinctly exceeds that of non-masters 

b. the average scores over the test of those who 
are masters of the level should be 
approximately the same 

c. r biserial values for each item should 
approximate or exceed .5000 
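
The study does not print the formula behind its point biserial values, so the sketch below uses the standard textbook definition as an assumption: the difference between the mean scores of pupils who got the item right and those who got it wrong, divided by the standard deviation of all scores and weighted by the proportions right and wrong. The function names and the six-pupil sample data are hypothetical. The t conversion shown is the usual one for a correlation on n pupils; for r = .618 and n = 272 it gives about 12.9.

```python
from math import sqrt
from statistics import mean, pstdev

def point_biserial(item_correct, scores):
    """Point biserial r between a right/wrong item and a numeric score."""
    right = [s for c, s in zip(item_correct, scores) if c]
    wrong = [s for c, s in zip(item_correct, scores) if not c]
    p = len(right) / len(scores)          # proportion answering correctly
    return (mean(right) - mean(wrong)) / pstdev(scores) * sqrt(p * (1 - p))

def t_value(r, n):
    """t statistic for a correlation r based on n pupils."""
    return r * sqrt(n - 2) / sqrt(1 - r * r)

# Hypothetical data: six pupils, item right (1) or wrong (0), and a score.
item_correct = [1, 1, 1, 0, 0, 0]
scores = [6, 5, 4, 3, 2, 1]

r = point_biserial(item_correct, scores)
print(round(r, 3))                      # 0.878
print(round(t_value(0.618, 272), 2))    # 12.92
```

An item meeting rule 2c would show r at or above roughly .5 under this definition.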

Version IV A results are described in Figure VII. Of the 
272 pupils tested, 232 or 85 per cent were identified distinctively 
with a certain level. Table 5.11 summarizes the proportional 
reasoning levels assigned. 

Table 5.11 

Proportional Reasoning Levels of Grade 8 Pupils on Version IV A 

Number    Level    Stage            Per cent 

  35      0                           13 
  26               Transitional        9 
  62      I        Concrete I         23 
  12               Transitional        4 
  76      II       Concrete II        28 
   2               Transitional        1 
  55      III      Formal I           20 
   4      IV       Formal II           1 

Total    272 


















il 










• 












m 


55 








1110 












11 




2 ' 








1101 






78 








110 


76 








1100 


211 








1 






0 








XUJLJL 






7 








TOT 


7 








1010 












10 




5 ./ 
✓ — 








1001 






67 








ion 


62 








1000 


P7P 
c. { c 


• 




< 








0 








01 11 






h 








Oil 


L 








01 10 




22 








m 




0 








oirii 






18 








010 


18 








03.00 


61 














0 








0011 














OOT 


L. 








0010 




39 








00 




0 








0001 






35 








000 


35 








"oooo 



Figure VII. Pupil Performance on Test Version IV A 




Grade eight responses by items are described in Table 5.12. 

Table 5.12 



Version IV A Item Decisions 



Level I Items 














- 


Test Item 


1^1 


llCi 


9Ci 




2Ci 


ICi 


Average 


io Correct 
Responses 
N = 272 


63 


71 


69 


62 


66 


55 




Decision 


Use 


V4 M 


Use 


Use 


Add 
table 


More 

diagram 

detail 




Level II Items 
















Test Item 


1^2 


IIC2 


IOC2 


5C2 


3C2 


2C2 


Average 


% Correct 
Responses 
N = 272 


77 


51 


68 


68 


60 


69 


65 


Decision 


Use 


use 


Use 


Replace 


Use. 


TTco 

use 




Level HI Items ■ 














Test Item 


I8F1 


ITFi 


llFi 


lOF-L 


8F-L 


2Fi 


Average 


% Correct 
Responses 
N = 272 


58 




65 


1^8 


37 






Decision 


Use 


Use 


Eeplace Sinrplify Use 
ratios 


Use 




Level IV Items 
















Test Item 


I9P2 


.X5E2 


lOFg 


9G2 


5F2 




Average 


% Correct 
Responses 
N = 272 


21 


18 


38 


19 




29 


27 


Decision 


Use 


Use 


Use 


use 


Use 


Use 








It was apparent that Level I items were too difficult and 
Level II items too easy. Item discrimination information from 
Table 5.13 was used as indicated. 

Version IV B 

Test Version IV B consisted of thirty items. The twenty-four 
items forming the core of the test were identical to those of 
test Version IV A. The additional six items, however, were from 
Level IV to allow improvement of Level IV items. Test items were 
randomly ordered in the test. The test was administered in two 
forms. One form had the reverse order of the other form. 

Test Version IV B was taken by sixty-nine physics pupils 
at the same time as test Version V A was being administered. 
Results from Version IV B were not available for improvement of 
Version V A. Pupil performance on Version IV B is summarized in 
Figure VIII. 

Decision rules for improvement of the items of Version 
IV B included information from calculation of the point biserial 
measure of item discrimination. The decision rules were: 

1. Items within a level should have homogeneity in 
their overall difficulty as evidenced in: 

a. the total percentage of persons correctly 
answering the item 

b. the percentage of persons attaining the 
level who correctly answer the item 

c. the number getting the item wrong 






Table 5.13 
Item Discrimination Version IV A 



I Getting # Getting 4verage Score on Point 





Item 


Item 


This Level 


Biserial 


T 








Corrects 


Wrongs 


Correlation 


Value 


1-1 . ,. 


197 


75 


82.7 


l|8.9 


,618* 


12.91 


1-2 


21^7 


25 


77.7 


31.3 


.5'i'7* 


10,72 


1-3 


217 


55 


78.8 


52.1 


.430* 


8,00 


i4 


203 


69 


81.9 


l[8.3 

Tv# J 


.598* 


■1 A All 

12,24 


1-5 


l61f 


108 


81I.9 


56.0 


.576* 


11,58 


1-6 


170 ■ 


■ 102 


86.1 


52.3 


,668* 


l^^.75 


Level I Average 


199.7 


72.3 


82.0 


l|8.2 






2-1 


198 


7^ 


•71.6 


llO.l 


.563* 


11.20 


2-2 


166 


106, 


7^.7. 




,590* 


11.99 


• 2-3 


193 


79 


72.3 


ki,k 


,580* 


11.70 




202 


70 


70.3 


I13.1 


.if91* 


9.27 


2-5 


155 


117 ., 


71.1 


53.0 


.370* 




,2-6 


119 


153 


77.6 


52.2 


.521* 


10,02 


Level II Average 


172.2 


99.8 


72.9 


ii6,o 







* significant at the .001 level 






Table 5.13 (continued) 
Item Discrimination Version IV A 



I Getting | Getting Average Score on , Point 



Question 


Item 
Correct 


Item 
Wong 


This Level < 
Corrects Wrongs 


Biserial 
Correlation 


T 

Value 


3-1 


131 


Ikl' 


PfiO 




.6l2# 


12.70 


3-2 


96 


176 


1l 




. ,52'i* 


lO.U 


3-3 


155 


117 






,581* ■ 


11.7!^ 


3-1^ . 


175 ■ 


97 




?ll ll 


.593* • 


12.09 


3-5 


52 


220 




42.0 


.075** 




3-6 


91 


■ 181 


59^7 


3i1.it 


.513* 


9.82 


level III Average 


129.6 


ll|2.l| 


56.9 


33.6 








80 


192 


39.0 


17.9 


.510* 


9.73 


I|-2 


75 


197 


I1O.7 


17.8 


' .5^3* 


10.62 


^-3 


83 


189 


39.0 


'17.5 


.523* 


10.08 




Id 


22lf 


39.6 


20,8 


.381* 


6.77 




70 


202 


36.9 


19.6 


.ta# 


7.18 




37 


235 


37.8 


21.9 


.290* 




Level IV Average 


65.5 


206,5 


38.8 


19.3 







* Significant at the ,001 level 
#* Significant at the ,1 level 



Table 5.13 (continued) 



Item Discrimination Version IV A 



f Getting f Getting Average Sco// Point 
Item Item This Ley/ Bis^^i^l 

jn Correct Wrong Corrects mif^j^J^^i^. 



5-1 


153 


119 


3T.7 


5-2 


^5 


227 


30.1^ 


5-3 


52 


220 


50.3 


54 


29 


2lf3 


^8.9 


5-5 




175 


I16.2 


5-6 


56 


216 


50.3 


tevelV Average 


129,6 


ll|2 


56.9 



* Significant at the ,001 level 
Significant at the ,1 level 






6? 



100 



Figure VIII. Pupil Performance on Test Version IV B 




28 
1111 

57 

111 29 

mo 

11 h 

1101 



no 2 

1100. 

3 

1011 



101 JO 

^ 1010 

1001 



1000 



. 69 _ 



on 0 

ono 

1 

01 0 

Old 



010 1 

moo 



oon 



001 0 

0010 

1 

0 

0001 



000 1 

0000 






2. Items within a level should discriminate between 
responses of persons mastering that level and those 
not mastering the level as evidenced in: 

a. pupils coded as masters of a level should 
have performance on items of that level 
that clearly exceeds that of non-masters 

b. the average scores over the test of those 
who are masters of a level should be 
approximately the same 

c. point biserial values for each item should 
approximate .500 or better 

That physics pupils were indeed masters was confirmed by 
their performance as summarized in Table 5.14. 

Table 5.14 



Version IV B Item Responses of Physics Pupils 



Level I Iteas 
















Test Item 


llfCi 


UCi 




l+C-L 




ICi 


Average 


'fo Correct 
Responses 
N = 69 


91 


9h 


96 


91 


93 


91 


93 


Level II Items 

Test Item     14C2    11C2    10C2    5C2     3C2     1C2     Average 
% Correct 
Responses      93      86      91      81      86      84       87 
N = 69 


Level HI Items 
















Test Item 


18P1 


17Pi 




lOPi 


8Pi 




Average 


'fo Correct 
Responses 
N = 69 


81+ 


70 


87 


81+ 


62 


90 


80 


Level IV Items 
















Test Item 
^ Correct 
Responses 
N = 69 


19F2 
52 


15P2 
7h 


IOP2 
57 


9G2 
71+ 


5F2 


35 


Average 
58 




Item discrimination information summarized in Table 5.13 
and the information from Table 5.14 supported the replacement of 
item 1F2 in Version V B. 

Version V A 

Test Version V A contained thirty items. Twenty-four items
were the core of the test. Each of the four proportional reasoning
levels had six test items from this set of twenty-four. The
additional six items were from Level IV to support improvement of
Level IV items from pupil performance on this test and the
performance of masters on test Version IV B.

Items were randomly ordered in the test. The test was
administered in two forms. One form had the reverse order of the
other form.

Test Version V A was administered to 427 grade eight pupils
at Oak Grove Junior High School. Included were most of the
original forty pupils who participated in task testing. Pupil
performance on test Version V A is summarized in Figure IX.

Figure IX. Pupil Performance on Test Version V A

Improvements of this version were possible through the
rescoring of Level IV items. Decision rules for such improvements
included information from calculation of the point biserial
measure of item discrimination. The decision rules were:

1. Items within a level should have homogeneity
in their overall difficulty as evidenced in:

a. the total percentage of persons correctly
answering the item

b. the percentage of persons attaining the
level who correctly answer the item

c. the number getting the item wrong

2. Items within a level should discriminate between
responses of persons mastering that level and
those not mastering the level as evidenced in:

a. pupils coded as masters of a level should
have performance on items of that level
that clearly exceeds that of non-masters

b. the average scores over the test of those
who are masters of a level should be
approximately the same

c. point biserial values for each item
should approximate .500 or better
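
The point biserial criterion in rule 2c can be illustrated with a short Python sketch. It correlates a right/wrong item with a numeric score. The function name and the six-pupil data are illustrative assumptions and are not taken from the study.

```python
from math import sqrt


def point_biserial(item_correct, scores):
    """Point biserial correlation between a 0/1 item and a numeric score.

    item_correct: list of 0/1 flags (1 = pupil answered the item correctly)
    scores:       list of numeric scores for the same pupils
    """
    n = len(scores)
    right = [s for c, s in zip(item_correct, scores) if c == 1]
    wrong = [s for c, s in zip(item_correct, scores) if c == 0]
    if not right or not wrong:
        raise ValueError("item needs both correct and wrong responses")

    mean_all = sum(scores) / n
    # Population standard deviation of all scores.
    sd = sqrt(sum((s - mean_all) ** 2 for s in scores) / n)
    if sd == 0:
        raise ValueError("scores have no variance")

    p = len(right) / n  # proportion answering correctly
    mean_right = sum(right) / len(right)
    mean_wrong = sum(wrong) / len(wrong)
    return (mean_right - mean_wrong) / sd * sqrt(p * (1 - p))


# Hypothetical data: three pupils right, three wrong.
item = [1, 1, 1, 0, 0, 0]
level_scores = [5, 6, 4, 2, 3, 1]
r_pb = point_biserial(item, level_scores)
meets_rule_2c = r_pb >= 0.5  # decision rule 2c threshold
```

Under rule 2c an item would be retained when the returned value is about .500 or higher.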

Seventy-five per cent (322) of the 427 total grade eight
pupils were clearly identified with a proportional reasoning level.
Summarizing Figure IX results, the proportional reasoning levels
assigned were those of Table 5.15.

Table 5.15

Proportional Reasoning Levels of Grade 8 Pupils on Version V A

Number     Level     Stage              Per cent

  99       0         Preoperational        23
  58                 Transitional          14
  71       I         Concrete I            17
  39                 Transitional           9
  62       II        Concrete II           15
   8                 Transitional           2
  67       III       Formal I              16
  23       IV        Formal II              5



Pupil responses by item are summarized in Table 5.16.



Table 5.16

Version V A Item Responses of Grade 8 Oak Grove Pupils (N = 427)

Level I Items
Test Item        % Correct Responses
14C1                   68
11C1                   71
9C1                    72
4C1                    59
2C1                    64
1C1                    57
Average                65

Level II Items
Test Item        % Correct Responses
14C2                   67
11C2                   55
10C2                   69
5C2                    35
3C2                    50
1C2                    53
Average                55

Level III Items
Test Item        % Correct Responses
18F1                   46
17F1                   34
11F1                   55
10F1                   57
8F1                    39
2F1                    59
Average                48

Level IV Items
Test Item        % Correct Responses
19F2                   33
15F2                   37
10F2                   45
9G2                    16
5F2                    25
1F2                    26
Average                30



It was apparent that changes from Version IV A were
improvements with the exception of the replacement of item 5C2.
These results suggested that items 9G2 and 5F2 needed improvement.
Results from Version IV B, physics masters, supported the change
of item 5F2. Results on 9G2 by masters were commendable, suggesting
that this item was likely a higher order proportional reasoning
level. The item discrimination information of Table 5.15
confirmed the need for replacement of items 9G2 and 5F2 and suggested
that appropriate replacement items would be items 12F2 and 2F2.

Version V B 

Test Version V B was obtained by a reworking of the V A
results; items 9G2 and 5F2 were replaced with items 12F2 and 2F2.
The results for these items were appropriately assigned and the
overall test results recalculated. Pupil performance on this, the
final test version, is summarized in Figure X. Seventy-four per
cent (317) of the 427 total pupils were clearly identified with a
proportional reasoning level. Table 5.17 summarizes the Figure X
results in terms of percentages of pupils attaining each
proportional reasoning level.

Table 5.17

Proportional Reasoning Levels of Grade 8 Pupils on Version V B

Number     Level     Stage              Per cent

  98       0         Preoperational        23
  58                 Transitional          14
  67       I         Concrete I            16
  42                 Transitional          10
  60       II        Concrete II           14
  10                 Transitional           2
  60       III       Formal I              14
  32       IV        Formal II              7

 427       Total









Figure X. Pupil Performance on Test Version V B

Table 5.18 presents pupil responses by item for Version
V B. The replacement of the two Level IV items did improve the
test.



Table 5.18

Version V B Responses of Grade 8 Oak Grove Pupils (N = 427)

Level I Items
Test Item        % Correct Responses
14C1                   68
11C1                   71
9C1                    72
4C1                    59
2C1                    64
1C1                    57
Average                65

Level II Items
Test Item        % Correct Responses
14C2                   66
11C2                   55
10C2                   69
5C2                    35
3C2                    50
1C2                    53
Average                55

Level III Items
Test Item        % Correct Responses
18F1                   46
17F1                   34
11F1                   55
10F1                   57
8F1                    39
2F1                    59
Average                48

Level IV Items
Test Item        % Correct Responses
19F2                   33
15F2                   37
10F2                   45
12F2                   28
2F2                    33
1F2                    26
Average                34





Table 5.19 presents data which confirm the homogeneity of
items by level and relates the discrimination these items have.
There is consistency between the number getting the items correct
and wrong by level. The average scores on the items of those who
got the item correct and those who got it wrong are similar.
Item discrimination, as measured by point biserial correlation
coefficient, does consistently approximate .500. T-value suggests
that the correlation values are not due to chance.

Table 5.19

Version V B Item Discrimination

            Test       Number     Number     Average Score on
            Item       Getting    Getting    This Level            Biserial       T
Question    Number     Correct    Wrong      Corrects   Wrongs     Correlation*   Value

1-1           1         289        138        76.1      (illeg.)     .598         15.38
1-2           5         302        125        74.0       41.9        .558         13.85
1-3          20         306        121        74.6       40.9        .568         14.22
1-4          15         253        174        77.9      (illeg.)     .579         14.66
1-5        (illeg.)     273        154        73.9       49.4       (illeg.)      10.12
1-6          23         243        184        80.1       45.1        .649         17.57
Level I Average         278        149        76.2       44.4        .565         14.30

2-1          21         284        143       (illeg.)   (illeg.)     .561         13.96
2-2          12         236        191        67.2       39.0        .545         13.40
2-3          18         293        134        65.1       31.6        .605        (illeg.)
2-4          14         148        279        72.0      (illeg.)     .491         11.62
2-5           8         213        214        67.6       41.1       (illeg.)      12.02
2-6           2         225        202        66.7      (illeg.)    (illeg.)      11.73
Level II Average        233        194        67.2       38.9        .533         13.06

3-1           7         198        229        65.4      (illeg.)     .586         14.92
3-2        (illeg.)     146        281        65.6      (illeg.)     .461         10.71
3-3          17        (illeg.)    194        61.5       32.9        .535         13.04
3-4          11         245        182        61.9       30.7        .579         14.65
3-5          13         168        259        66.1       37.3        .528         12.82
3-6           3        (illeg.)    173        61.2       30.1       (illeg.)     (illeg.)
Level III Average       208        219        63.6      (illeg.)    (illeg.)     (illeg.)

4-1          16         139        288       (illeg.)    21.8        .559         13.90
4-2          19         157        270       (illeg.)    21.7        .515         12.38
4-3        (illeg.)     192        235        42.1       20.4        .505         12.07
4-4          10         120        307        39.6       17.5        .488         11.52
4-5          22         141        286        39.4       16.0        .540         13.23
4-6           6         109        318        47.6      (illeg.)     .476         11.17
Level IV Average        143        284       (illeg.)    20.3        .513         12.38

* All biserial correlations are significant at the .001 level
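
The link between a correlation coefficient and its T value can be sketched in Python. The sketch assumes the standard test statistic t = r * sqrt(n - 2) / sqrt(1 - r^2) with n = 427 pupils; the source does not state which formula was used, and the function name is illustrative.

```python
from math import sqrt


def t_from_r(r, n):
    """t statistic for testing that a correlation r from n cases is zero."""
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    if n <= 2:
        raise ValueError("need more than two cases")
    return r * sqrt(n - 2) / sqrt(1.0 - r * r)


# Reported biserial value .598 for test item 1, with 427 pupils.
t_item_1 = t_from_r(0.598, 427)
```

Under this assumption the value for item 1 comes out near the tabled 15.38, which suggests the T column was derived this way.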

Summary

Paper-pencil items were improved through the changes
primarily based on test results of non-master pupils,
transitional pupils and master pupils.

Performance of comparable pupils on the five versions is
shown in Table 5.20. The items which are reported under Version I
are those 24 of the 36 that were used in Version II. Increased
item homogeneity is evident in the decreasing range of percentage
correct. Higher average values in most levels were also achieved
in the later versions.



Table 5.20

Percentage Correct on Test Versions by Grade 8 Pupils

                  Level I            Level II           Level III          Level IV
Version           Range     Avg.     Range     Avg.     Range     Avg.     Range     Avg.

I (24 items       43-63   (illeg.)   25-68     42       25-55     39       10-31     22
  only)
II             (illeg.)-72   63      7-62      46       21-55     43       10-31     22
III               62-72     66       38-69     56       27-68     47       21-62     34
IV                55-71     65       51-77     55       34-58   (illeg.)   18-38     34
V                 57-72     65       50-66     55       34-59     48       26-37     34



CHAPTER 6
CHARACTERISTICS OF THE INSTRUMENT



In this chapter, criteria for validity, reliability and
discrimination of the instrument are given. The statistical
analysis of the instrument is described and judgments are made
regarding the instrument's performance with respect to the stated
criteria.

Validity 

Content Validity 

Validity of a test is a measure of the degree to which the
test measures what it is intended to measure. One component of
validity is content validity. In accord with Cronbach (1960), a
test has content validity if the items in the test require behaviors
for their resolution that are proper to the trait being measured.
The purpose of this test was to measure four levels of proportional
reasoning. Items were written for each of the four levels. Each
item used, as the question stem, a situation that had been used in
task testing or had appeared in the literature. Specifications
for writing the responses were that the key, correct answer, would
be a response at the level tested and the distractors would be
plausible for lower levels of reasoning.

This logical relationship of item design to theory is
demonstrated in the following examples (see Figures XI, XII, XIII,
and XIV) of item design taken from the test's final version. The
test had strong content validity because the items in each level
met the specifications for proportional reasoning of Piaget and
Inhelder (1958).

Concurrent Validity

Concurrent validity, as defined by Cronbach (1960), exists
when the test correlates highly positively with direct test
measures of the same trait as the initial test. Concurrent
validity of the paper-pencil test was assumed to be acceptable when
the pupil paper-pencil test scores showed a positive correlation
of at least .30 with their corresponding task interview scores.
The criterion value of .30 was based on the range of reported
inter-task correlations, -.15 to .55 (Lawson, Nordland and DeVito,
1975). Table 6.1 summarizes the correlations for thirty-five
pupils who were measured with both tasks and the paper-pencil test.

Tasks 1, 2 and 3 are, respectively, the shadow task, Mr.
Tall task and the sled task. Rate 4, Rate 8 and Rate 16 are three
rating schemes used to evaluate paper-pencil results. Under Rate 4
every pupil was assigned to one of four proportional reasoning
levels, namely I, II, III or IV, with no transitional stages.
Under Rate 8 transitional stages were identified, namely 0, 1.0,
1.5, 2.0, 2.5, 3.0, 3.5, and l+,0. Under Rate 16 the values them- 



Item Design: Concrete I Stage (Level I)

              Stage             Score   Criteria

Key           Concrete I          4     Subject compensates in a qualita-
                                        tive way. May match two direct
                                        ordered relations or use addition
                                        or subtraction to contrast or
                                        calculate ratios
                                        A < B < C < D
                                        J > K > L > M

Distractor    Reasoned Guess      3     Subject makes erroneous connection
                                        but one which involves appropriate
                                        elements

Distractor    Reasoned Guess      2     Subject makes reverse ordered con-
                                        nection but involves elements

Distractor    Illogical Guess     1     Subject guesses or makes no ordered
                                        connection - nonsensical

Distractor    None                0     Subject makes no response

Item Example

Mary buys three tickets to a raffle where 90 tickets are sold.
Jane buys one ticket to a raffle where 30 tickets are sold. Sue
buys three tickets to a raffle where 300 tickets are sold.

Which girls have about the same chance of winning?

Answer                                                 Stage

Jane and Mary because three chances in 90              Concrete I
is the same as one in 30
Sue and Mary because each have three tickets           Reasoned Guess
Jane and Mary because theirs are the least             Reasoned Guess
All girls have the same chance                         Illogical Guess
I have no answer                                       None

Figure XI. Level I Item Design and Example: Test Item 1



Item Design: Concrete II Stage (Level II)

              Stage             Score   Criteria

Key           Concrete II         4     Subject orders corresponding
                                        relations (with inverse)
                                        A < B < C < D
                                        J > K > L > M

Distractor    Concrete I          3     Subject compensates in some
                                        qualitative, non-ordered way
                                        (or direct - not inverse)

Distractor    Reasoned Guess      2     Subject makes erroneous connection
                                        but one which involves elements

Distractor    Illogical Guess     1     Subject guesses or makes no con-
                                        nection between how things change

Distractor    None                0     Subject makes no response

Item Example

Four cars have different speeds: Car A is the fastest, Car B the
next fastest, Car C the next fastest and Car D the next fastest.
The fastest car takes the least time to go 200 miles, the next
fastest car the next least time and so on. Which car is the third
fastest and takes the third least time to go 200 miles?

Answer                                                 Stage

A. Car C because:                                      Concrete II
   1st fastest      2nd fastest      3rd fastest
   Car A            Car B            Car C
   1st least time   2nd least time   3rd least time

C. Car C because:                                      Concrete I
   1st most fast    2nd most fast    3rd most fast
   Car A            Car B            Car C
   1st most time    2nd most time    3rd most time

B. No car because they don't match up                  Reasoned Guess

D. Car B because:                                      Illogical Guess
   1 - Car D        2 - Car C        3 - Car B

E. I have no answer                                    None

Figure XII. Level II Item Design and Example: Test Item 12



Item Design: Formal I Stage (Level III)

              Stage             Score   Criteria

Key           Formal I            4     Subject multiplies, uses simple
                                        ratios, contrasts ratios and can
                                        order them
                                        5/25, 2/25
                                        5/25 x 10 = 2

Distractor    Concrete II         3     A rule, usually addition or sub-
                                        traction, is used to contrast or
                                        calculate ratios

Distractor    Concrete I          2     Subject compensates in some
                                        qualitative way

Distractor    Guess               1     Subject guesses or makes no con-
                                        nection between how things change

Distractor    None                0     Subject does not respond

Item Example

Jane is weighing out apples on this supermarket
scale. What will fourteen apples weigh if six
apples weigh 2 pounds?

Answer                                                 Stage

C. 4 2/3 lbs. because 2/6 x 14 = 4 2/3                 Formal I
B. 3 or 4 lbs. because it is more                      Concrete II
A. (illegible)                                         Concrete I
D. 5 because (illegible)                               Guess
E. I have no answer                                    None

Figure XIII. Level III Item Design and Example: Test Item 24



Item Design: Formal II Stage (Level IV)

              Stage             Score   Criteria

Key           Formal II           4     Subject calculates using pro-
                                        portions and recognizes the appro-
                                        priate proportion to be used.
                                        A/B = C/D     A/B = C/D = E/F

Distractor    Formal I            3     Subject multiplies or uses simple
                                        ratios

Distractor    Concrete II         2     A rule, usually addition or sub-
                                        traction, is used to calculate the
                                        increase or decrease

Distractor    Concrete I          1     Subject compensates in some
                                        qualitative way

Distractor    None                0     Subject guesses or makes no con-
                                        nection between how things change

Item Example

On the ramp illustrated, the cart and its weight are balanced by
weights on the string. What amount of weight is needed to balance
400 g of cart weight at 20°?

                Weight
Angle      Cart       String

10°        200g         35
10°        300g         52
20°        300g        100
20°        400g

Answer                                                 Stage

D. 133 because 300/100 = 400/133                       Formal II
A. 133 because 100/300 x 400 = 133                     Formal I
C. 177 because it goes up 17 for every 100             Concrete II
B. 150 because it is more                              Concrete I
E. I have no answer                                    None

Figure XIV. Level IV Item Design and Example: Test Item 16



Table 6.1

Pearson Correlation Coefficients for
Tasks and Paper-Pencil Ratings
N=33

            Task 1      Task 2      Task 3      Task Av.    Rate 4      Rate 8

Task 2      .59
            S=.001*

Task 3      (illeg.)    .27
            S=.018      S=.062

Task Av.    .83         .77         .73
            S=.001      S=.001      S=.001

Rate 4      (illeg.)    .31         .25         (illeg.)
            S=.011      S=.04       S=.079      S=.009

Rate 8      .36         .29         .24         .38         .99
            S=.020      S=.052      S=.085      S=.015      S=.001

Rate 16     .35         .28         .23         .36         .98         1.00
            S=.023      S=.058      S=.096      S=.019      S=.001      S=.001

* S is significance level

selves were used and ordered in this manner: 

0000; 1000, 0010, 0001, 0011; 1100, 0101, 1001, 0100;
1110, 0110, 0111, 1010; 1111, 1101, 1011

See Chapter 5 for a complete description of these ratings.

Correlations exceeding the .30 level were reported for
Task 1 with all ratings, for Task 2 with Rate 4, for Task 3 with
no ratings, and for the task average with all ratings.

The test was assumed to have acceptable concurrent
validity since the paper-pencil results reported as Rate 8
(reasoning levels and transition scores) had a Pearson correlation
coefficient of .38 with the average task score, which exceeded the
minimum .30 level and was significant at the .015 level.

Construct Validity 

According to Cronbach (1971), a test has construct validity
if it measures the attribute it is said to measure. It follows
then that if the test does not measure other things, it is
acceptable. Comparison of pupil test performance was made with
pupil task scores and with pupil intelligence scores measured with
the Lorge-Thorndike verbal, nonverbal and total test.

The test had groups of questions for each of the successively
more difficult levels. The observed pupil difficulty levels
between groups of questions were compared.

It was assumed that construct validity would be evident in
the convergence of scores of other measures of the same test.
Correlations between task scores and the paper-pencil scores would
be high, positive and higher than task score correlations with
intelligence test scores.

The Pearson correlations using the scores of the thirty-five
pupils participating in both task and paper-pencil testing were .36
between average task score and paper-pencil test rating, .53
between task scores and Lorge-Thorndike nonverbal IQ and .35
between task scores and Lorge-Thorndike verbal IQ. Although the
correlation between task and paper-pencil scores was positive and
high, it was exceeded by the value for task and nonverbal IQ
correlation. It must be mentioned that the correlation between
paper-pencil scores and Lorge-Thorndike nonverbal IQ was .58 and
between paper-pencil scores and Lorge-Thorndike verbal IQ was .30.
It is suspected that the high correlation with Lorge-Thorndike
nonverbal is from some relationship with what is being measured
and also from the continuous data provided by Lorge-Thorndike
scores.

Additionally, it is a construct of Inhelder and Piaget
(1958) that successive levels of proportional reasoning require
progressively more sophisticated reasoning. Similarly, construct
validation suggests that the difficulty level of items would be
expected to show an increasing difficulty with higher levels of
the test. This is illustrated in Figure XV.



Figure XV. Average Per Cent Success of 427 Eighth Grade Pupils
at the Four Test Levels



Further support for this difficulty construct was obtained
by comparing the expected difficulty rank of items by group and
the observed difficulty rank. It was expected that in each level
all items would have identical ranking, that is, 3.5 for
every item in Level I. The following array in Table 6.2 resulted.

Table 6.2

Comparison of Observed and Expected Item Difficulties
(# Right)

             Test      Expected    Observed
             Item      Rank        Rank

Level I        1         3.5          4
               2         3.5          2
               3         3.5          1
               4         3.5          7
               5         3.5          6
               6         3.5          8

Level II       7         9.5          5*
               8         9.5         11
               9         9.5          3*
              10         9.5       (illeg.)
              11         9.5         14
              12         9.5         13

Level III     13        15.5         15
              14        15.5         20
              15        15.5         12
              16        15.5         10*
              17        15.5         17
              18        15.5          9*

Level IV      19        21.5         22
              20        21.5         18
              21        21.5         16
              22        21.5         23
              23        21.5         21
              24        21.5         24

* Items of evident discrepancy in rank order.



A measure of the continuity of this type of order is the
Spearman rank correlation coefficient (Glass and Stanley, 1970),
which for this array has a value of .87. This value suggests good
construct validity in terms of difficulty rankings.
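
A Spearman rank correlation with tied ranks can be sketched in Python as a Pearson correlation computed on average ranks. This is a generic sketch, not the author's calculation; the function names and the small examples are illustrative assumptions. The tie handling reproduces the expected ranks used in Table 6.2, where six items tied at one level share a single average rank.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("ranks have no variance")
    return cov / sqrt(vx * vy)


# Six items per level, four levels: every item in a level shares one rank.
levels = [1] * 6 + [2] * 6 + [3] * 6 + [4] * 6
expected = average_ranks(levels)  # 3.5, 9.5, 15.5 and 21.5 by level
```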

Discriminant Validity 

A test has discriminant validity if it discriminates
between the trait it measures and other traits. Evidence of
discriminant validity was expected in smaller correlations of
paper-pencil proportional reasoning scores with notebook averages
than correlation of paper-pencil proportional reasoning scores with
teacher-test scores. This should be evidenced also in smaller
correlations of paper-pencil proportional reasoning scores with
verbal IQ scores than with nonverbal IQ scores.

Pearson correlation coefficients with test rating (0, 1,
1.5, 2.0, 2.5, 3, 3.5, 4) were: for small group average, .42; class
test average, .60; notebook average, .22; verbal intelligence, .58;
nonverbal intelligence, .64. These were all statistically
significant at the .001 level.

Convergent Validity 

A test has convergent validity if its measurement corresponds
to other measurements of the same trait. Convergent validity would
be evidenced in high positive correlations with other tests
measuring the same trait. That is, correlations between task
scores and paper-pencil scores should be high, positive and higher
than those with intelligence scores.

Convergent validity would be evidenced in results that
compare with the results of other researchers. That is, the
proportion of persons measured to be formal operational should
correspond to the proportions reported in the literature. There
should be noted a positive correlation between proportional
reasoning level and age (Inhelder and Piaget, 1958; Karplus and
Peterson, 1970; Lawson, 1973; Hensley, 1974).

Convergent validity would be evidenced in the identity of
components of proportional reasoning. That is, components of
proportional reasoning should account for much of pupil achievement
and intelligence. Pearson correlation coefficients with task
scores for the thirty-five person sample taking both tests and tasks
were: paper-pencil tests, .36; Lorge-Thorndike verbal, .35;
Lorge-Thorndike nonverbal, .53.

The proportions of eighth grade pupils successful at each
level reported in this test were: Level I, 77 per cent; Level II,
56 per cent; Level III, 36 per cent; and Level IV, 13 per cent.
Corresponding values reported for a sample of 75 eighth to tenth
grade pupils were: Levels I and II, 49 per cent, and Levels III and
IV, 36 per cent (Karplus and Peterson, 1970). For a sample of 30
eighth grade pupils, the results were: Level I and below, 100 per
cent; Level II, 70 per cent; Level III, 20 per cent and Level IV,
one per cent (Hensley, 1974).

The correlation between test rating and age was found to
be -.01198, which was not statistically significant at the .05 level.

The age correlation of other researchers cited was reported over
ranges of ten to thirty years. The age range of the sample was
about one year.

A principal components analysis identified two principal
components. The first accounting for hh^Q per cent of the variance 
the second 4.7 per cent. The first component loads heavily on
measures of pupil achievement and intelligence. The test had
acceptable convergent validity by these measures.

Summary of Validity

In summary, the test had high content validity, acceptable 
concurrent validity, good construct validity, high discriminant 
validity and acceptable convergent validity. 

Reliability 

Reliability is concerned with the fact that repeated
measures should duplicate each other (Stanley, 1971). Measures of
reliability center on the variability of response. In a
criterion-referenced test, then, reliability may have a special
meaning. As a criterion for reliability, it was expected that the
same person or comparable person taking the paper-pencil instrument
or a comparable paper-pencil instrument should exhibit a comparable
percentage of mastery. A classical one-form reliability measure
(Hoyt, 1941) was calculated. Individual pupil scores and the total
number of correct responses were used. The reliability coefficient,
equivalent to the Kuder-Richardson Twenty value, was .78. Data
and calculations of this are in Appendix C.
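
As an illustration of the Kuder-Richardson Twenty value mentioned above, the following Python sketch computes KR-20 from a matrix of right/wrong item scores. It uses the standard KR-20 formula rather than Hoyt's analysis-of-variance procedure, and the four-pupil, three-item data set is a hypothetical example, not data from Appendix C.

```python
def kr20(responses):
    """Kuder-Richardson formula 20 for a list of per-pupil 0/1 item scores."""
    n = len(responses)       # number of pupils
    k = len(responses[0])    # number of items
    if k < 2:
        raise ValueError("need at least two items")

    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n
    # Population variance of the total scores.
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    if var_total == 0:
        raise ValueError("total scores have no variance")

    # Sum over items of p * q, where p is the proportion correct.
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in responses) / n
        sum_pq += p * (1 - p)

    return k / (k - 1) * (1 - sum_pq / var_total)


# Hypothetical responses: rows are pupils, columns are items.
toy = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
toy_kr20 = kr20(toy)
```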

In a second approach, the criterion-referenced nature of
the testing and the scoring by category were acknowledged and
Livingston's (1972) approach was used.

This approach afforded a correction for the criterion level
and the variance limitation of criterion-referenced testing. The
relationship used was:

r_c = \frac{r_{tt}\,\sigma^2(x) + (\bar{X} - C)^2}{\sigma^2(x) + (\bar{X} - C)^2}

where:

r_c = criterion-referenced reliability

r_{tt} = classical measure of reliability (Hoyt, 1941)

\sigma^2(x) = variance of the test scores

\bar{X} = mean of test scores

C = criterion level

The criterion-referenced reliability thus obtained (r_c) was
.84, when the criterion level C was taken as 15. This was the level
value for assignment of pupils to be either concrete or formal level
proportional reasoners. Calculations may be found in Appendix C.
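
Livingston's relationship can be written as a small Python function. The function and argument names are illustrative, and the variance and mean in the example are hypothetical values chosen only to show the arithmetic; the study's own figures are in Appendix C.

```python
def livingston(r_classical, variance, mean, criterion):
    """Criterion-referenced reliability after Livingston (1972).

    r_classical: classical reliability of the test
    variance:    variance of the test scores
    mean:        mean of the test scores
    criterion:   criterion (cut) score C
    """
    if variance < 0:
        raise ValueError("variance cannot be negative")
    shift = (mean - criterion) ** 2  # squared distance of mean from C
    denom = variance + shift
    if denom == 0:
        raise ValueError("variance and mean-criterion distance are both zero")
    return (r_classical * variance + shift) / denom


# Hypothetical inputs: classical reliability .78, variance 16, mean 12, C = 15.
example = livingston(0.78, 16.0, 12.0, 15.0)   # (12.48 + 9) / 25
# When the mean equals the criterion, the classical value is returned.
unchanged = livingston(0.78, 16.0, 15.0, 15.0)
```

The further the mean lies from the criterion level, the more the coefficient rises above the classical value, which is consistent with the reported increase from .78 to .84.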

The reliability of the test, .84, compared favorably with
other attempts, which ranged from .23 to .76, in the literature.
Using Spearman-Brown split half measures, Lawson and Renner (1975)
reported r_tt = .76 for a biology reasoning level test, r_tt = .71 for
a chemistry reasoning level test and r_tt = .59 for a physics reasoning
level test. DeAvilla and Struthers (1967) used Cronbach's alpha
measure of reliability and reported these results for a set of
cartoon format paper-pencil tests: conservation, .694; causality,
.550; relations, .001; logic, .227; and total test, .717.

Reliability was also measured on a test-retest basis and
analyzed with the tetrachoric correlation coefficient and the Pearson
correlation coefficient (Nie, et al., 1975). The tetrachoric measure
(r_t) relates the reliability of the test to discriminate concrete
and formal proportional reasoning levels. The Pearson correlation
coefficient describes the relation of test-retest scores on the 24
test items.

The relationships were: r_t = .40 and r = .68 for a population
of 94 fifth grade pupils; r_t = .70 and r = (value illegible) for a
population of 419 eighth grade pupils; and r_t = .32 and r = .47 for
a population of 144 eleventh grade chemistry pupils. Past testing
had suggested that such fifth grade pupils would be largely
non-masters of formal level proportional thinking; eighth grade
pupils would be at the transitional stage between concrete and
formal level proportional thinking; and eleventh grade chemistry
pupils would be masters of formal proportional thinking. In the
manner suggested by Zeiky (1974), a sample of 338 fifth grade,
eighth grade and chemistry pupils was randomly selected from those
tested to comprise a sample of approximately equal numbers of
probable non-masters, transitional pupils and masters. This
composite sample's test-retest relationships were r_t = .84 and
r = .83. Appendix C contains the calculation data for these values.



Summary of Reliability

In summary, the test has high reliability as a criterion- 
referenced test. This reliability supports its use as an excellent 
group measure of proportional reasoning and a good individual 
measure of proportional reasoning. 

Item Difficulty 

Piaget has described developmental levels of proportional
reasoning (Inhelder and Piaget, 1958). The successive developmental
levels require progressively more sophisticated reasoning.
It was expected that the paper-pencil items would show increasing
difficulty as the higher levels were measured. It was also
expected that within a level item difficulties would be similar.
Table 6.3 presents these item difficulties in terms of the
percentage of grade eight pupils from Oak Grove Junior High School
getting the item correct. There was increasing difficulty with
higher levels as expected. The average percentage of pupils
getting items correct by levels was: Level I, 65 per cent; Level II,
55 per cent; Level III, 49 per cent; and Level IV, 34 per cent.

Item Discrimination

It was expected that items selected for the test should
demonstrate discrimination between masters and non-masters such
that:

1) differences in percentages correct should be in
agreement with the measured reasoning level of
the pupils (see Appendix E)

Table 6.3

Item Difficulties in Terms of Performance for 427 Grade 8 Pupils

            Item in Final     Percentage Getting     Average for
Level       Test Version      Item Correct           Level

I                 1                  68
                  5                  71
                 20                  72
                 15                  59
                  9                  64
                  4                  57                 65%

II               21                  67
                 12                  55
                 18                  69
                 14                  35
                  8                  50
                  2                  53                 55%

III               7                  46
                 23                  34
                 17                  55
                 11                  57
                 13                  39
                  3                  60                 49%

IV               16                  33
                 19                  37
              (illeg.)               45
                 10                  28
                 22                  33
                  6                  26                 34%



2) r biserial values of .50 or above should be
reported between masters and non-masters of
items

3) item distractors selected by a pupil should
match the pupil's reasoning level

Table 6.4 presents the percentage of correct item responses
of pupils at five proportional reasoning levels. The 0 level
represents a pupil who was unsuccessful at achieving four or more






Table 6.4

Percentage of Correct Pupil Responses in Relation to Pupil Tested Reasoning Level

Pupil Level      Questions for         Questions for         Questions for         Questions for
(pattern, N)     Level I               Level II              Level III             Level IV

All              68 71 72 59 64 57     67 55 69 35 50 53     46 34 55 57 39 60     33 37 [?] 16 25 26
(N=427)

Level 0          29 29 [?] [?] [?] 17  36 31 39 18 30 36     27 19 24 28 16 29     24 26 20 10 15 27
(0000, N=99)

Level I          69 82 80 80 65 70     53 31 41  8 30 41     20 21 45 38 24 46     24 27 34  6 [?] [?]
(1000, N=71)

Level II         90 82 84 74 73 81     92 81 94 55 69 65     26 16 53 58 27 59     21 31 58 16 21 15
(1100, N=62)

Level III        96 [?] [?] [?] [?] 79 94 73 97 63 81 79     81 61 78 93 67 91     36 39 63 13 22 16
(1110, N=41)

Level IV         100 100 96 83 87 96   100 91 100 65 74 83   91 65 83 87 83 100    91 91 83 57 57 70
(1111, N=23)




correct responses at any of the four proportional reasoning levels:
1 - Concrete I, 2 - Concrete II, 3 - Formal I, or 4 - Formal II. A
Level I pupil achieved four or more correct responses at Level I
but failed criterion achievement at other levels, 1000. A Level II
pupil achieved four or more correct responses at both Levels I and
II, but failed criterion achievement at Levels III and IV, 1100,
and so on for Level III, 1110, and Level IV, 1111. The sharp
discrimination across the levels was evident at the line on the
table separating the master and non-master levels. This line for
questions in Level II shows that respectively 53, 31, 41, 8,
30 and 41 per cent of Level I pupils correctly answered these
questions while 92, 81, 94, 55, 69 and 65 per cent of Level II
pupils respectively correctly answered them. Clearly the item
collections were capable of discriminating the masters from the
non-masters.
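
The mastery patterns described above (0000, 1000, 1100, 1110, 1111) can be sketched as a small classification routine. This is a minimal illustration, assuming six items per level and a criterion of four correct; the function names are hypothetical, and treating any other pattern (for example 0100) as unclassified is an assumption, since the text does not say how such pupils were handled.

```python
# Mastery patterns named in the text, mapped to reasoning levels 0 through 4.
LEVEL_BY_PATTERN = {"0000": 0, "1000": 1, "1100": 2, "1110": 3, "1111": 4}


def mastery_pattern(correct_by_level, criterion=4):
    """Build a pattern such as '1100' from the number correct on each subtest."""
    return "".join("1" if n >= criterion else "0" for n in correct_by_level)


def reasoning_level(correct_by_level, criterion=4):
    """Return the level for a pattern, or None if the pattern is not one of the five."""
    return LEVEL_BY_PATTERN.get(mastery_pattern(correct_by_level, criterion))


# A pupil with 5, 4, 2 and 1 correct on the four subtests has pattern 1100.
print(mastery_pattern([5, 4, 2, 1]), reasoning_level([5, 4, 2, 1]))
```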

As an item discrimination index the biserial r correlation
coefficient, r bis, was calculated for each item. It was expected
that these values would be .50 or greater. As reported in Table 6.5,
only six of the twenty-four items failed to meet this criterion.
Test items had good discrimination according to this measure.
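
The T values reported alongside the coefficients in Table 6.5 appear consistent with the usual t statistic for a correlation coefficient. The sketch below assumes that relationship and assumes 425 degrees of freedom (427 pupils minus 2); neither assumption is stated in the text, and `t_from_r` is an illustrative name.

```python
import math


def t_from_r(r, df):
    """t statistic for a correlation r with df degrees of freedom (assumed formula)."""
    return r * math.sqrt(df / (1.0 - r * r))


# Item 1 is reported with r = .5992 and T = 15.4292.
print(round(t_from_r(0.5992, 425), 2))
```

Under these assumptions the computed values agree with the reported T values to about two decimal places, the small differences being attributable to rounding of r.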

Item design required that the key, or correct answer, and
the distractors, or other answers, all be written at different
reasoning levels. This was intended to make the correct answer
and other answers appeal to persons at each reasoning level.
Level IV items had answers appropriate to all four reasoning






Table 6.5

Item Discrimination

Level    Item    r Biserial    T Value     Significance    df

I          1       .5992       15.4292       < .001
           5       .5557       13.7778
          20       .5673       14.2011
          15       .5809       14.7110
           9       .4420       10.1571
          24       .6473       17.5075

II        21       .5620       14.0085       < .001
          12       .5471       13.4731
          18       .6057       15.6926
          14       .4880       11.5266
           8       .5061       12.0961
           2       .4959       11.7713

III        7       .5871       14.9497       < .001
          23       .4592       10.6555
          17       .5352       13.0616
          11       .5797       14.6676
          13       .5291       12.8531
           3       .5780       14.6031

IV        16       .5584       13.8763       < .001
          19       .5317       12.9411
           4       .4773       11.1979
          10       .4527       10.4673
          22       .5243       12.6943
           6       .4595       10.6646



levels as illustrated in the problem below:

19. A freeway driver keeps track of the distance he travels. He
    finds that in 4 minutes he travels 3 miles, in 10 minutes
    7½ miles. If he continues at this speed, how long will it
    take him to travel 10 miles?

        Distance        Time

        3 miles         4 min.
        7½ miles        10 min.
        10 miles        ? min.

    A. About 13 minutes because                  Level IV   Formal II
       4 min./3 miles = 10 min./7.5 miles = 13 1/3 min./10 miles

    B. About 13 minutes because                  Level II   Concrete II
       10 - 7½ = 2½ miles and
       10 + 2½ = 12½ min.

    C. About 13 minutes because                  Level III  Formal I
       4/3 x 10 = 13 1/3

    D. About 14 minutes because                  Level I    Concrete I
       7½ + 3 = 10½ and
       10 + 4 = 14

    E. I have no answer.                         Level 0

A more complete discussion of this item design may be found in
Chapter 5.
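
The arithmetic behind the answer choices of item 19 can be written out to show how the different reasoning strategies lead to different numbers. This is only an illustrative sketch; the function names are hypothetical and the study itself contains no such code.

```python
from fractions import Fraction


def formal_ratio(distance):
    """Choices A and C: apply the constant ratio of 4 minutes per 3 miles."""
    return Fraction(4, 3) * distance


def additive_concrete_ii(distance):
    """Choice B: add the remaining miles to the 10 minutes already observed."""
    return 10 + (distance - Fraction(15, 2))


def additive_concrete_i():
    """Choice D: add the two observed times, 10 + 4."""
    return 10 + 4


print(float(formal_ratio(10)))          # proportional answer, about 13.3 minutes
print(float(additive_concrete_ii(10)))  # additive answer, 12.5 minutes
print(additive_concrete_i())            # additive answer, 14 minutes
```

Only the proportional strategy gives the correct time; the two additive strategies arrive at nearby values by adding differences instead of using the ratio.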

A cross tabulation was made of item responses with pupil
levels for each item in Level IV. For item 19 the cross tabulation
was that found in Table 6.6. In the table it may be read that for
58 pupils of Level III, four selected a Level 0 response, eight
selected a Level I response, thirteen selected a Level II response,
fifteen selected a Level III response and only eight selected a
Level IV response.

These cross tabulations suggested that the item design
worked. Pupils did select answers appropriate to their reasoning
level. Table 6.7 shows that only for items four and six was this
selection pattern not significant at the .001 level.






Table 6.6

Cross Tabulation of Pupil Response and Pupil Level for Item 19

Pupil                     Response Level
Level         0        I        II       III      IV       Totals

0            14        8        18       10       17         67
I             5       13        13        9       11         51
II            5       13         9        8       16         51
III           4        8        13       15        8         58
IV            0        1         1        1       23         26

Totals       28       43   [illegible]   43       85        253

Chi-square = 56.16 with 16 degrees of freedom
Significant at < .00001
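
A chi-square statistic for a cross tabulation of this kind is computed from observed and expected cell counts, with (rows - 1) x (columns - 1) degrees of freedom, which gives the 16 degrees of freedom reported for a five-by-five table. The sketch below is a generic illustration with a hypothetical function name and a small made-up table; it does not reproduce the study's computation.

```python
def chi_square(table):
    """Return (statistic, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of row and column.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Made-up 2 x 2 example: every expected count is 15.
print(chi_square([[10, 20], [20, 10]]))
```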



Table 6.7

Cross Tabulation Significance for Level IV Items

Item in Final
Test Version      Chi-square      Significance

    16              56.465          < .0001
    19              56.161          < .0001
    10              52.159          < .0001
     4              27.456            .0367
    22              78.902          < .0001
     6              39.668            .0055






Summary 

The test instrument appeared to have high content validity
and good construct validity. Reliability of the instrument was
good. Items were excellent in their discrimination and generally
appropriate in difficulty.






CHAPTER 7 
CONCLUSIONS 

Review of Purpose and Procedure 

The purpose of this study was to develop a paper-pencil
instrument to evaluate pupil proportional reasoning levels and to
demonstrate how the application of principles of criterion-referenced
test design could be used to build, validate and use
such a test.

Individual task-testing of a representative group of forty
pupils was used to establish a reference group for paper-pencil
testing and to determine probable topics for test items. Paper-pencil
testing of pupils who by reason of age were assumed to be
non-masters, at the transitional stage, and masters was conducted.
Analysis of item responses after each testing was used in item
improvement. In all, 2027 pupils were tested in arriving at the final
test and the description of its characteristics. Five major
revisions were made of the item sets comprising the test. The
final test form consisted of twenty-four items with four subtests
each of six items for Piaget levels Concrete Operational I,
Concrete Operational II, Formal Operational I and Formal
Operational II. The final test was completed by 90 per cent of
the pupils in a 50-minute testing period.




Findings 






The final test version was analyzed to describe the test
characteristics. It was found that:

1) The paper-pencil test results correlated with the
   initial task results of a group of 35 pupils taking
   both tests. A value of .36 was obtained for the
   three task average and the final test scores.

2) Content, concurrent construct, divergent and
   convergent validity were established for the paper-pencil
   test. The test by all measures must be
   considered valid.

3) Reliability was assessed by the Kuder-Richardson-20
   approach as modified by Hoyt. The reliability
   coefficient .77 suggested good reliability for the
   test. Reliability, calculated according to
   Livingston (1972) for criterion-referenced tests, was
   .84. The .84 value suggested that the test had high
   reliability.

   Reliability calculated from test-retest results
   established a Pearson value of .83 for overall
   reliability and a value of .84 for the discrimination
   of formal and concrete levels.

4) Good item discrimination between proportional
   reasoning levels was established. The item design
   utilizing correct answers but different reasons was
   successful.

5) Pupil levels of proportional reasoning determined in
   the testing agree with those of other researchers
   (Hensley, 1974; Lawson, 1973; Karplus and Peterson,
   1970). In contrast with Inhelder and Piaget's (1958)
   results, lower proportions of thirteen-year-olds
   were found to be formal operational in proportional
   reasoning in this study.
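
The two reliability coefficients named in finding 3 can be illustrated with the standard textbook formulas. The sketch below uses the ordinary Kuder-Richardson formula 20 (not Hoyt's analysis-of-variance form, which the study cites) and Livingston's criterion-referenced coefficient; the function names and the tiny response matrix are hypothetical and are not data from the study.

```python
def kr20(responses):
    """Kuder-Richardson formula 20 for rows of 0/1 item scores (one row per pupil)."""
    n_pupils = len(responses)
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n_pupils
    # Population variance of the total scores.
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_pupils
    sum_pq = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in responses) / n_pupils
        sum_pq += p * (1 - p)
    return n_items / (n_items - 1) * (1 - sum_pq / var_total)


def livingston_k2(reliability, variance, mean, cutoff):
    """Livingston's coefficient: reliability adjusted for distance of the mean from the cutoff."""
    shift = (mean - cutoff) ** 2
    return (reliability * variance + shift) / (variance + shift)


toy = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]  # made-up scores
r = kr20(toy)
print(r, livingston_k2(r, variance=1.25, mean=1.5, cutoff=2.5))
```

When the cutoff equals the mean, Livingston's coefficient reduces to the ordinary reliability; it rises as the mean moves away from the cutoff, which is in line with the criterion-referenced value (.84) exceeding the Kuder-Richardson value (.77) reported above.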



Educational Implications 



The results of this study tended to confirm the study of
Gray (1970) who found that paper-pencil measures of Piaget levels
of cognitive development may be developed and that criterion-referenced
test theory of Hambleton and Novick (1974) is effective
in test design.

Paper-pencil tests of Piaget measures in other areas of
cognitive development could be developed following
the strategy used in this study. Control of variables, higher
order proportions, causal relationships and functions are examples
of areas certain to be of interest in science education.

The group test of this study and others like it should be
used by teachers in evaluating the level of proportional reasoning
in their classes. It has been expressed as a concern (Almy, 1973)
that teachers recognize the level of thinking of their pupils.

Present science curricula, resulting from the activities of
the sixties, do demand formal reasoning. The Piaget levels
required in the science process skills are formidable (Wood, 1974).

This measurement tool and others developed in this manner
should aid teachers in locating the level of their pupils' cognitive
development. In an era where broad range achievement and intelligence
tests are under criticism, such a specific measure would
aid in diagnosis. The large scale testing possible with this
paper-pencil instrument will support improvement in curricula,
teaching strategies and organization for instruction.

Curriculum design needs attention. Measures of pupil
cognitive development are needed. Group testing with this test
and others to determine both the range and mode of these levels
would provide a solid base for curriculum design and would help in
correcting past errors.

Limitations of the Study and Suggestions 
for Further Research 

This study was limited to the development of a paper-pencil
instrument to measure proportional reasoning in eighth grade pupils.
Research is needed in the applicability of this instrument over a
broad range of pupil ages. The original attention to reading level
and empirical improvement of items would have to be repeated with
large groups of pupils at the levels to be tested. Longitudinal
studies of cognitive development with a group paper-pencil measure
would then be possible.

The results of the study indicate that the test is a valid,
reliable measure over the populations tested. Testing across other
socioeconomic and cultural groups would extend the generality of
the test. Some task testing to establish performance traits and
additional items for item improvement would be necessary. The
item improvement computer programs used in this study would support
additional items for alternative selection.

This study was directed toward the development of a single
paper-pencil instrument to measure proportional reasoning. Continued
large scale use would allow the development of alternate
forms through which further reliability measures could be made and
curriculum research supported by pre-post testing with these
alternate forms.




The proportional reasoning measure developed in this study
should be complemented by the development of parallel measures
including control of variables and logic. The test development
strategy could follow that which proved to be successful in this
study.






SELECTED BIBLIOGRAPHY



Adams, J., et al. (1975). Achievement of Minnesota Students in
Mathematics. St. Paul: Minnesota Department of Education
Office of Statewide Assessment.

Ahlgren, A. Remarks delivered at AAPT Convention, February 3,
1969, p. 2.

Airasian, P. W., and W. M. Bart. (1975). Validating a priori
instructional hierarchies. Journal of Educational
Measurement, 12:163-175.

Almy, M. (1964). Wishful thinking about children's thinking?
In W. A. Fullagur, H. G. Lewis and C. P. Cumber, Readings
for Educational Psychology. New York: Crowell. Pp. 389-[illegible].

________. (1970). Longitudinal [remainder of title illegible].
In M. F. Rosskopf, et al., Piagetian Cognitive Development
Research and Mathematical Education. Washington: National
Council of Teachers of Mathematics. ED 077714.

Almy, M., E. Chittenden and P. Miller. (1966). Young Children's
Thinking. New York: Teachers College Press.

Ausubel, D. (1965). Some psychological and educational
limitations of learning by discovery. The Arithmetic
Teacher, 12:290-302.

Ball, D. W. and S. A. Sayre. (1972). Relationships between
student Piagetian cognitive development and achievement in
science. Unpublished Ed.D. dissertation, University of
Michigan, Ann Arbor.

Bart, W. M. (1971). The factor structure of formal operations.
British Journal of Educational Psychology, 41:70-77.

________. (1972). Construction and validation of formal
reasoning instruments. Psychological Reports, 30:663-670.

Beistel, D. W. (1975). A Piagetian approach to chemistry.
Journal of Chemical Education, 52:151-152.

Besel, R. (1973). Using group performance to interpret individual
responses to criterion-referenced tests. SWRL Professional
Paper 25.



Bridgham, R. G. (1969). Classification, seriation and learning
of electrostatics. Journal of Research in Science
Teaching, 6:118-127.

Carpenter, T. P., et al. (1975a). Notes from national assessment:
basic concepts of area and volume. The Arithmetic Teacher,
22:501-507.

________. (1975b). Results and implications of the NAEP
mathematics assessment: secondary school. The Mathematics
Teacher, 68:6.

Carver, R. P. (1970). Special problems in measuring change with
psychometric devices. In Evaluative Research: Strategies
and Methods. Pittsburgh: American Institute for Research.

Chittenden, E. A. (1974). Personal conversation at Educational
Testing Service, Princeton, February, 1974.

Clemenson, R. W. (1970). A comparative study of three fifth grade
classrooms on five selected Piaget type tasks dealing with
science related concepts. Unpublished Ph.D. dissertation,
University of Iowa.

Copeland, R. W. (1974). How Children Learn Mathematics. New York:
Macmillan.

Cronbach, L. J. (1960). Essentials of Psychological Testing
(2nd ed.). New York: Harper. Pp. 23, 25.

________. (1971). Test validation. In R. L. Thorndike, ed.,
Educational Measurement. Washington: American Council on
Education.

Dale, L. G. (1970). The growth of systematic thinking: replication
of Piaget's first chemical experiment. Australian Journal
of Psychology, 22:277-286.

Darley, J. G. and G. V. Anderson. (1951). The functions of
measurement in counseling. In E. F. Lindquist, ed.,
Educational Measurement. Washington: American Council on
Education. Pp. 68-84.

DeAvila, E. and J. A. Struthers. (1967). Development of a group
measure to assess the extent of pre-logical and pre-causal
thinking in primary school age children. Paper presented
at the 1967 Annual Convention of the National Science
Teachers Association. ED 019136.



DeStefano, J. (1973). Linguistics and logical reasoning. Theory
into Practice, 12(5):272-277.

DeVries, R. (1973a). Relationships among Piagetian levels,
achievement and intelligence. Paper presented at the
American Educational Research Association Meeting, New
Orleans, March 1, 1973. ED 079101.

________. (1973b). The two intelligences of bright, average and
retarded children. Paper presented at the Biennial Meeting
of the Society for Research in Child Development,
Philadelphia, March 29, 1973. ED 079102.

Easley, J. A. (1974). The structural paradigm in protocol
analysis. Journal of Research in Science Teaching, 11:281-290.

Ebel, R. L. (1971). Criterion-referenced measurements: limitations.
School Review, 69:282-288.

Elkind, D. (1961). Quantity conceptions in junior and senior
high school students. Child Development, 32:551-560.

________. (1962). Quantity conceptions in college students. The
Journal of Social Psychology, 57:459-465.

________. (1975). Piaget. Human Behavior, 4:25-31.

Emrick, J. A. (1971). An evaluation model for mastery testing.
Journal of Educational Measurement, 8:4.

Fehr, H. F. (1974). The secondary school mathematics curriculum
improvement study: a unified mathematics program. The
Mathematics Teacher, 67:25-30.

Flavell, J. H. (1963). The Developmental Psychology of Jean
Piaget. Princeton: D. Van Nostrand.

Fremer, J. (1972). Criterion-referenced interpretations of
survey achievement tests. Test Development Memorandum.
Princeton: Educational Testing Service.

Ginsburg, H. and S. Opper. (1969). Piaget's Theory of
Intellectual Development. Englewood Cliffs, N.J.:
Prentice Hall.

Glaser, R. (1963). Instructional technology and the measurement
of learning outcomes. American Psychologist, 18:519-521.



Glaser, R. and R. C. Cox. (1968). Criterion-referenced testing for
the measurement of educational outcomes. In R. Weisgerber,
ed., Instructional Process and Media Innovation. Chicago:
Rand McNally. Pp. 545-550.

Glaser, R. and A. J. Nitko. (1971). Measurement in learning and
instruction. In R. L. Thorndike, ed., Educational
Measurement. Washington: American Council on Education.
Pp. 525-670.

Glass, G. V. and J. C. Stanley. (1970). Statistical Methods in
Education and Psychology. Englewood Cliffs, N.J.:
Prentice Hall.

Goodyear, J. and J. Renner. (1975). The multiple-choice test in
the science classroom. The Science Teacher, 42:32-34.

Grant, N. and J. Renner. (1975). Identifying types of thought in
tenth grade biology pupils. The American Biology Teacher,
37:283-286.

Gray, M. (1970). Children's performance on logically
equivalent Piagetian tasks and written tasks. Doctoral
thesis. Ann Arbor: University Microfilms.

Green, B. P. (1956). A method of scalogram analysis using summary
statistics. Psychometrika, 21:79-88.

Green, D. R., M. P. Ford and G. B. Flamer, eds. (1971). Measurement
and Piaget. New York: McGraw Hill.

Guttman, L. (1944). A basis for scaling qualitative data.
American Sociological Review, 10:255-282.

________. (1947). Cornell scale and intensity analysis.
Educational and Psychological Measurements, 7:247-279.

Hall, V. and R. Kingsley. (1968). Conservation and equilibration
theory. Journal of Genetic Psychology, 111:195-213.

Hambleton, R. K. and M. R. Novick. (1972). Toward an integration
of theory and method for criterion-referenced tests.
American College Testing Program Research Report 55. Iowa
City: American College Testing Program.

Harris, C. W. (1972). An interpretation of Livingston's
reliability coefficient for criterion-referenced tests.
Journal of Educational Measurements, 9:27-29.



Hensley, J. H. (1974). An investigation of proportional thinking
in children from grades six through twelve. Unpublished
doctoral thesis, University of Iowa.

Herron, J. (1975). Piaget for chemists. Journal of Chemical
Education, 52:146-150.

Hieronymus, A. N. (1971). Today's testing: what do we know how
to do. Address, American Educational Research Association
Meeting, Minneapolis.

Higgins-Trenk, A. and A. J. H. Gaite. (1971). The elusiveness of
formal operational thought in adolescents. Paper presented
at 79th meeting of the American Psychological Association,
Washington, D.C., September 4. ED 063972.

Hively, W., H. L. Patterson and S. A. Page. (1968). Universe
defined system of arithmetic achievement tests. Journal
of Educational Measurement, 5:275-290.

Holloway, G. E. T. (1967). An Introduction to the Child's
Concept of Geometry. New York: Humanities Press.

Howe, A. (1974). Formal operational thought and the high school
science curriculum. Paper presented to the NARST, Chicago,
April, 1974. ED 092364.

Hoyt, C. J. (1952). Estimation of test reliability for
unrestricted item scoring methods. Educational and
Psychological Measurements, 12:756-[illegible].

Inhelder, B. and J. Piaget. (1958). The Growth of Logical
Thinking from Childhood to Adolescence. New York: Basic
Books.

________. (1969). The Early Growth of Logic in the Child:
Classification and Seriation. New York: Norton.

Ivens, S. H. (1970). An investigation of item analysis,
reliability and validity in relation to criterion-referenced
tests. Unpublished doctoral dissertation,
Florida State University.

Jackson, S. (1965). The growth of logical thinking in normal and
subnormal children. British Journal of Educational
Psychology, 35:255-[illegible].

Jensen, J. (1973). A comparative investigation of the casual and
careful oral language styles of average and superior fifth
grade boys and girls. Research in the Teaching of English,
7:223-250.



Karplus, E. and R. Karplus. (1970). Intellectual development
beyond elementary school. I: deductive logic. School
Science and Mathematics, 70:398-406.

Karplus, R. and E. Karplus. (1972). Intellectual development
beyond elementary school. III: ratio, a longitudinal
study. School Science and Mathematics, 72:735-742.

________. (1974). Proportional reasoning and control of variables.
Unpublished paper. Cambridge: Massachusetts Institute of
Technology.

Karplus, R. and R. Peterson. (1970). Intellectual development
beyond elementary school. II: ratio, a survey. School
Science and Mathematics, 70:813-820.

Karplus, R., E. Karplus and W. Wollman. (1974). Intellectual
development beyond elementary school. IV: ratio, the
influence of cognitive style. School Science and
Mathematics, 74:476-482.

Kaufman, B. A. and R. Konicek. (1974). The application of Piaget
to contemporary curriculum reform. Paper presented to the
National Association for Research in Science Teaching, 47th
Annual Meeting, Chicago, April, 1974.

Kavanagh, D. C. (1974). An investigation of a model hierarchy
for the acquisition of the concept of speed. Paper
presented to the National Association for Research in
Science Teaching, Annual Meeting, Chicago, April, 1974.

Keasy, C. (1971). The nature of formal operations in pre-adolescence,
adolescence and middle age. Doctoral dissertation,
University of California, Berkeley.

Kohlberg, L. and C. Gilligan. (1971). The adolescent as a
philosopher. Daedalus, 100:1051-1086.

Kriewall, T. E. (1969). Applications of information theory and
acceptance sampling principles to the management of
mathematics instruction. Unpublished doctoral dissertation,
University of Wisconsin.

Kulm, G. (1973). Sources of reading difficulty in elementary
algebra textbooks. Mathematics Teacher, 66:649-652.

Laurandeau, M. and A. Pinard. (1962). Causal Thinking in the
Child. New York: International University Press.



Lawson, A. E. (1973). Relationships between concrete and formal
operational science subject matter and the intellectual
level of the learner. Unpublished doctoral dissertation,
University of Oklahoma.

________. (1974). Relationships of concrete and formal operational
science subject matter and the developmental level of the
learner. Paper presented at the National Association of
Research in Science Teaching Convention, April, 1974.

Lawson, A. E. and J. W. Renner. (1975). Relationships of science
subject matter and developmental levels of learners.
Journal of Research in Science Teaching, 12:347-350.

Lawson, A. E., F. H. Nordland and A. DeVito. (1975). Relationship
of formal reasoning to achievement, aptitudes and attitudes
in preservice teachers. Journal of Research in Science
Teaching, 12:423-431.

Linn, M. and H. Thier. (1975). The effect of experiential science
on development of logical thinking in children. Journal of
Research in Science Teaching, 12:59-62.

Livingston, S. A. (1972). Criterion-referenced applications of
classical test theory. Journal of Educational Measurement,
9:13-23.

Lovell, K. (1961). A follow-up study of Inhelder and Piaget's
The Growth of Logical Thinking. The British Journal of
Psychology, 52:143-153.

________. (1970). Proportion and probability. In M. F. Rosskopf,
et al., Piagetian Cognitive Development Research and
Mathematics Education. Washington: National Council of
Teachers of Mathematics.

Lovell, K. and I. B. Butterworth. (1966). Abilities underlying
the understanding of proportionality. Mathematics Teaching,
37:5-9.

Lovell, K. and J. B. Shields. (1967). Some aspects of a study of
the gifted child. British Journal of Educational
Psychology, 37:201-205.

Lunzer, E. A. (1965). Problems of formal reasoning in test
situations. In P. H. Mussen, ed., Monographs of the
Society for Research in Child Development, European
Research in Cognitive Development, 30:19-46.



Lunzer, E. A. and P. Pumfrey. (1966). Understanding proportionality.
Mathematics Teaching, 34:7-12.

Lunzer, E. A., C. Harrison and M. Davey. (1972). The four-card
problem and the generality of formal reasoning. Quarterly
Journal of Psychology, 24:326-339.

McCormack, A. J. and V. Bybee. (1970). Piaget and the
training of elementary science teachers. Address at NSTA
Convention, Cincinnati, Ohio, March 12-16.

McKinnon, J. W. and J. W. Renner. (1971). Are colleges concerned
with intellectual development? American Journal of
Physics, 39:1047-1052.

McLeod, H., G. Berkheimer, D. Fyffe and R. Robison. (1975). The
development of criterion-validated test items for four
integrated science processes. Journal of Research in
Science Teaching, 12:415-421.

Mallon, E. J. (197[?]). Cognitive development and processes:
review of the [remainder of title illegible]. The American
Biology Teacher, 38:28-33.

Mandell, A. (1974). The Language of Science. Washington:
National Science Teachers Association.

Margenau, H. (1950). The Nature of Physical Reality. New York:
McGraw-Hill.

Mehrens, W. A. and I. J. Lehman. (1972). Measurement and
Evaluation in Education and Psychology. New York: Holt.

Meyers, S. S. (1970). Questions illustrating the kinds of
thinking required in current mathematics tests. Princeton:
Educational Testing Service.

Mink, O. G. (1964). Experience and cognitive structure. In
R. E. Ripple and V. N. Rockcastle, eds., Piaget
Rediscovered. Ithaca, N.Y.: Cornell University.

Mogar, M. (1960). Children's causal reasoning about natural
phenomena. Child Development, 31:59-65.

Nisbet, J. D., et al. (1964). Puberty and test performance.
British Journal of Educational Psychology, 34:202-203.



Nitko, A. J. (1974). Problems in the development of criterion-referenced
tests: the IPI Pittsburgh experience. In
Harris, Alkin and Popham, eds., CSE Monograph Series
in Evaluation. Los Angeles: Center for the Study of
Evaluation.

Nitko, A. J. and T. Hsu. (1974). Using domain-referenced tests
for student placement, diagnosis and attainment in a
system of adaptive, individualized instruction.
Educational Technology, 14:[pages illegible].

Novak, J. D. (1974). Summary of science education research. A
paper presented at the 1974 NARST Convention, Chicago.

Osborne, A. R. (1973). Promoting logical ability. Theory into
Practice, 12:286-291.

Osiki, K. J. (1974). A comparison of affective and cognitive
development in elementary school students. A paper
presented at the 1974 NARST Convention, Chicago.

Phillips, D. R. (1974). Formal operational thought and dogmatism.
Paper presented to the National Association for Research in
Science Teaching, 47th Annual Meeting, April, 1974, Chicago.

Phillips, D. G. (1974). Changing teachers' perception of
"learning": an application of Piaget's theory and
experiments. Address at the National Association for
Research in Science Teaching, 47th Annual Meeting, April,
1974, Chicago.

Piaget, J. (1926). The Language and Thought of the Child.
London: Kegan Paul.

________. (1964). Development and learning. In R. E. Ripple
and V. N. Rockcastle, eds., Piaget Rediscovered. Ithaca,
N.Y.: Cornell University. Pp. 7-20.

________. (1970). The Child's Concept of Motion and Speed.
New York: Ballantine Books.

________. (1972). Intellectual evolution from adolescence to
adulthood. Human Development, 15:1-12.

Piaget, J. and B. Inhelder. (1963). The Child's Conception of
Space. London: Routledge & Kegan Paul.



Piaget, J. and B. Inhelder. (1969). The Psychology of the Child.
New York: Basic Books.

________. (1971). Mental Imagery in the Child. London: Routledge
& Kegan Paul.

Popham, W. J. and T. R. Husek. (1969). Implications of criterion-referenced
measurement. Journal of Educational Measurement,
6:1-9.

Raven, R. J. (1972). A multivariate analysis of task dimensions
related to science concept learning difficulties in primary
school children. Journal of Research in Science Teaching,
9:207-221.

________. (1974). Programming Piaget's logical operations for
science inquiry and concept attainment. Journal of Research
in Science Teaching, 11:251-261.

Reichard, S., M. Scheiden and D. Rapaport. (1944). The development
of concept formation in children. American Journal
of Orthopsychiatry, 14:152-162.

Renner, J. W. and A. E. Lawson. (1973). Piagetian theory and
instruction in physics. Physics Teacher, 11:165-169.

Ripple, R. E. and V. N. Rockcastle, eds. (1964). Piaget
Rediscovered. Ithaca, N.Y.: Cornell University.

Robertson, W. W. and E. Richardson. (1975). The development of
some physical science concepts in secondary school students.
Journal of Research in Science Teaching, 12:319-329.

Rosskopf, M. F., et al. (1970). Piagetian Cognitive Development
Research and Mathematics Education. Washington: National
Council of Teachers of Mathematics. ED 077714.

Rowell, J. A. and P. J. Hoffman. (1975). Group tests for
distinguishing formal from concrete thinkers. Journal of
Research in Science Teaching, 12:157-164.

Sayre, S. A. and D. W. Ball. (1975). Piagetian cognitive
development and achievement in science. Journal of
Research in Science Teaching, 12:147-156.

Shepler, J. L. (1969). A Study of Parts of the Development of a
Unit on Probability and Statistics for the Elementary
School. Research and Development Center for Cognitive
Learning, Report No. 105. Madison: University of Wisconsin.



Smedslund, J. (1964). Internal necessity and contradiction in
children's thinking. In R. E. Ripple and V. N. Rockcastle,
eds., Piaget Rediscovered. Ithaca, N.Y.: Cornell University.

Stanley, J. C. (1971). Reliability. In R. L. Thorndike, ed.,
Educational Measurement. Washington: American Council on
Education.

Steffe, L. P. and R. B. Parr. (1968). The Development of the
Concept of Ratio and Fraction in the Fourth, Fifth and
Sixth Years of Elementary School. Research and Development
Center for Cognitive Learning, Report No. TR-49.
Madison: University of Wisconsin.

Strauss, S. (1972). Learning theories of Gagne and Piaget:
implications for curriculum development. Teachers College
Record, 81-102.

Sund, R. B. and L. W. Trowbridge. (1973). Teaching Science by
Inquiry in Secondary Schools. Columbus, Ohio: Merrill.

Towler, J. and G. Wheatley. (1971). Conservation concepts in
college students, a replication and critique. The Journal
of Genetic Psychology, 118:265-270.

Trowbridge, L. (1974). Trends and innovations in junior high
science teaching in the United States. The Science
Teacher, 41:12-15.

Tuddenham, R. D. (1971). Theoretical regularities and individual
idiosyncrasies. In D. R. Green, M. P. Ford and G. B.
Flamer, eds., Measurement and Piaget. New York: McGraw-Hill.

Webb, R. A. (1974). Concrete and formal operations in very
bright six to eleven year olds. Human Development,
17:292-300.

White, R. (1974). Indexes used in testing the validity of
learning hierarchies. Journal of Research in Science
Teaching, 11:1, 61-66.

Wohlwill, J. F. (1960). A study of the development of the number
concept by scalogram analysis. Journal of Genetic
Psychology, 97:355-377.

________. (1968). Responses to class-inclusion questions for
verbally and pictorially presented items. Child Development,
39:449-465.






Wollman, W. and R. Karplus. (1974). Intellectual development
beyond elementary school. V: using ratio in differing
tasks. School Science and Mathematics, 75:593-613.

Wood, D. A. (1974). The Piaget process matrix. School Science
and Mathematics, 74:407-411.

Woodson, M. I. (1974). The issue of item and test variance for
criterion-referenced tests. Journal of Educational
Measurement, 11:63-64.

Zeiky, M. J. (1974). Methods of setting standards for
criterion-referenced item sets and applications and adaptations
of classical test theory for application to criterion-referenced
measures. An address to the Conference on
Criterion-Referenced Testing, Princeton,






APPENDIX A 
Pilot Study Results and Calculations 






Pearson Correlations between Pilot Task Scores and Written Test
and Intelligence Test Scores

                             Lorge-Thorndike
Pupil   Task   Paper   Verbal   Nonverbal   Total
  1     1.8    1.96      89        97         93
  2     3.0    1.40     118       121        120
  3     3.6    3.53       -         -          -
  4      .6    1.60      75        65         70
  5     1.8    3.48     128       142        135
  6     3.2    2.48     111       130        121
  7     2.8    2.44     108       138        123
  8     1.6    2.32      86       101         94
  9     3.6    2.54     118       136        127
 10      .8     .95      70        85         78
 11     3.0    1.88     107       106        107
 12     3.0      -      103       121        112
 13     3.2    2.16     116       119        118
 14     1.0      -       88        97         93
 15     3.6      -        -         -          -
 16     3.2    2.36     101       105        103
 17     2.6    2.24     103       111        107
 18     1.4    2.56      81        90         86
 19     3.6    1.88     104       108        106
 20     2.6      -       84        97         91
 21     2.8    3.04     114       130        122
 22     3.6    3.76     143       127        136
 23     2.4    3.33     111       117        114
 24     2.2    2.56     109       120        115
 25     2.0    3.12     109       112        111

                  N     Σx     Σx²     Σy      Σy²      Σxy     x̄      ȳ      r
Task/Paper       21    52.8   149.6    51.6    137.3    134.2   2.51   2.46   .35
Task/Verbal      23    55.8   153.4    2378    252664   6017    2.43   103    .709
Task/Nonverbal   23    55.8   153.4    2575    295933   6494    2.43   112    .665
Task/Total       23    55.8   153.4    2482    274452   6269    2.43   108    .713
Paper/Total      20    48.0   124.8    2186    244978   5404    2.40   109    .646
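
The r column can be checked against the listed sums with the raw-score form of the Pearson formula. The following is a minimal sketch, not the author's own computation: the function name is illustrative, and the Task/Nonverbal row is assumed to read N = 23, Σx = 55.8, Σx² = 153.4, Σy = 2575, Σy² = 295933, Σxy = 6494 (the Σxy digits are partly garbled in the scan).

```python
import math

def pearson_r_from_sums(n, sx, sxx, sy, syy, sxy):
    """Pearson r from N, sum x, sum x^2, sum y, sum y^2, sum xy."""
    numerator = n * sxy - sx * sy
    denominator = math.sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))
    return numerator / denominator

# Task/Nonverbal row of the pilot table (values as read from the scan).
r = pearson_r_from_sums(23, 55.8, 153.4, 2575, 295933, 6494)
print(round(r, 3))
```

Under these assumed readings the result agrees with the tabled .665 to three decimals.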






Relationships between Task, Paper-Pencil
and Intelligence Test Scores

                           Lorge-Thorndike
        Task    Paper-     Non-
Pupil   Av.     Pencil     verbal   Verbal
  1     2.3     1.00        111      110
  2     3.7     2.25        135      124
  3     3.3     2.50        126      108
  4     2.3     1.00         ?       117
  5     2.3     4.00        126       97
  6     2.0     2.25        133      111
  7     1.3     1.00         97      109
  8     1.7     0.00        109      112
  9     2.7     0.67        121      118
 10     3.7      ?           ?        ?
 11     1.7     3.00        123      115
 12     1.0     1.00         97       93
 13     2.0     1.25         ?        ?
 14     4.0     0.00        115      122
 15     2.7     2.00        125      117
 16     2.7     1.00        113       94
 17     1.3     0.00         99       86
 18     2.3     0.00         90       90
 19     2.3     1.25        111      111
 20     2.3     1.00        106       97
 21     1.7     2.00         98      106
 22     2.3     2.00        105      104
 23     3.0     3.00        106      122
 24     3.0     3.00        110      120
 25     2.3     3.00        126      118
 26     1.0     0.00         86       92
 27     3.0     3.25        137      120
 28     3.3      ?          129      119
 29     2.0     3.50        123      126
 30     1.7     0.00        115      106
 31     1.3     0.00         ?       103
 32     2.0     2.50        130      121
 33     2.3     1.75        132       98
 34     1.7     0.00        121      114
 35     2.0     0.00         91      102

(? = entry or digit illegible in the source copy)

                   N     Σx     Σx²    Σy      Σy²      Σxy     x̄      ȳ      r
Task/Paper        35    80.2    203    53.4    131      133     2.29   1.53   .36
Task/Nonverbal    35    80.2    203    3961    456265   9285    2.29   113    .53
Task/Verbal       35    80.2    203    3677    40227?   8623    2.29   105    .35
Paper/Nonverbal   35    53.4    131    3961    456265   6??3    1.53   113    .58
Paper/Verbal      35    53.4    131    3683    401531   5874    1.53   105    .30






APPENDIX B 
Task Interview Protocols 






1. Projection of Shadows (Hensley, 1974)

Thinking tested

Schema of proportions

Inverse proportions - physical

Material

A screen, 30 cm x 30 cm, is used to observe the shadows.
The shadows are made by three wire rings, 3.0 cm, 6.0 cm, and 9.0 cm
in diameter. Each ring has a support wire. The length of the
support wire is such that the center of each ring is 12.5 cm above the
bottom of the support wire. The rings are made from different
colors of wire as follows: 3.0 cm (white), 6.0 cm (red), 9.0 cm
(black). The rings are held vertically on a meter stick by optic
bench screen holders. The meter stick is marked only at each
10 cm length. Each mark is labeled with one of the following letters: N,
R, M, K, G, P, A, B and O. A clear light bulb is supported at one
end of the beam. The center of the bulb is 12.5 cm above the top
of the beam. The light is turned on and off by connecting or
disconnecting the cord to the 6 volt battery. One meter stick marked
in centimeters and millimeters is provided for the student to use.




Introduction

"Here is a board, a light and a screen. I can put up one
ring (6.0 cm) on the board (at 50 cm) and then when I turn on the
light (do it), I get a shadow of the ring on the screen."

Question

Initially seek out predictions of the effects of ring size
and ring position on the shadow with questions such as: "What
would you predict will happen if I use this smaller (3.0 cm) ring?"
"What else could change the size of the shadow?" "How?" Do what
is suggested.

Culminating Question 

"How might I make just one shadow using two rings? Explain 
why this works?" 



Scoring Criteria

Stage   Criteria                                            Score

I       The subject represents the shadow in the way          0
        the object appears to him. He does not perceive
        how the shadow is formed on the screen.

IIA     The subject recognizes that the size of the           1
        shadow depends on the size of the object. His
        knowledge goes no further.

IIB     In addition to the ring-size dependence of the        2
        shadow demonstrated in IIA, the subject suggests
        qualitatively that the distance affects the
        shadow size: the closer the object is to the
        screen, the smaller the shadow.

IIIA    The subject quantitatively compensates between        3
        distance and shadow size, and between distance and
        diameter, but this is not generalized as a rule. The
        subject begins to measure distance from the light
        source.

Scoring Criteria (continued)

Stage   Criteria

IIIB    From the start the subject measures both the
        distance from the light source and the diameter
        of the rings. He looks for a numerical
        hypothesis based on the divergent structure of
        the light rays. The subject is able to state
        in a numerical form the general relation for
        the two rings to have just one shadow.
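
The numerical relation looked for at Stage IIIB can be sketched with similar triangles. This is an illustration only, assuming the bulb acts as a point source and all distances are measured from the bulb; the function names and the 100 cm screen distance are hypothetical and do not come from the protocol.

```python
def shadow_diameter(ring_diameter_cm, ring_distance_cm, screen_distance_cm):
    """Shadow diameter on the screen for a ring lit by a point source."""
    return ring_diameter_cm * screen_distance_cm / ring_distance_cm

def matching_distance(ring_diameter_cm, other_diameter_cm, other_distance_cm):
    """Distance from the light at which a ring's shadow coincides with
    the shadow of another ring (equal diameter-to-distance ratios)."""
    return other_distance_cm * ring_diameter_cm / other_diameter_cm

screen = 100.0  # assumed screen distance from the bulb, in cm
x_small = matching_distance(3.0, 6.0, 50.0)       # 3.0 cm ring vs 6.0 cm ring at 50 cm
print(x_small)                                    # 25.0
print(shadow_diameter(3.0, x_small, screen))      # 12.0
print(shadow_diameter(6.0, 50.0, screen))         # 12.0
```

The two rings give one shadow when diameter divided by distance from the light is the same for both.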



2. Mr. Short and Mr. Tall (Karplus and Karplus, 1970)

Thinking tested

Schema of proportions

Direct proportion - geometric

Material

Paper sketch of Mr. Tall
Large paper clips
Small paper clips
Chart

             Biggies    Smallies
Mr. Tall
Mr. Short

[Figure: the same chart with the measurements entered (Mr. Tall, 3 big; Mr. Short, 2 big).]

Introduction

"I have here a picture I call Mr. Tall. He measures about
3 big paper clips, that is, biggies, from head to toe." Measure and
write on chart. "Mr. Small, whom I don't have here, looks just
like Mr. Tall but Mr. Small measures just 2 biggies from head to
toe." Write on chart.

Question

"Measure Mr. Tall in small paper clips (smallies) and then
predict what height Mr. Small would be if you could measure him
in smallies. Explain how you got your answer."






Scoring Criteria

Stage   Criteria

I       Subject guesses, gives answers with no
        compensations.

IIA     Subject qualitatively compensates, "It
        should be smaller," with no rule.

IIB     Subject compensates through inappropriate
        but consistent addition or subtraction.
        "It was 2 biggies less so it's 2 smallies
        less."

IIIA    Subject quantitatively compensates. Subject
        works through some multiple or a
        multiplication factor.

IIIB    Subject states a proportion with numbers
        in his solution.






3. Sled (Piaget, 1970)

Thinking tested

Proportional reasoning
Direct as square
Physical

Material

A 30 cm grooved ruler with a steel backing mounted so
that marbles may be rolled down it. Electric stop watch.

Introduction

"Imagine that this is a hill on which you are sledding and
you start at the top and go down like this marble (let the marble
roll down chute, have watch running). Imagine you had a watch."

Question

"Suppose, as you called out each second as you went down
the hill, someone placed a flag just where you were at that time.
Sketch how the flags would be separated. Explain how you got
your answer."






Scoring Criteria

Stage   Criteria                                          Score

I       Subject's pattern is erratic or he has no           0
        pattern

IIA     Subject's pattern illustrates some notion           1
        of speed

IIB     Subject shows some kind of acceleration but
        without a constant pattern

IIIA    Subject's pattern relates constant
        acceleration

IIIB    Subject's pattern relates constant
        acceleration and subject states an overall
        rule. "All the time you would go faster
        and faster."






Thinking tested 



Proportional reasoning 
Direct proportions 
Geometric 



Material 




4? ao 



Two rods are laid out perpendicular to a numbered
measuring grid. The orange rod is 16 units long, the yellow rod
is 10 units long. Then the orange rod is turned to another angle.

Introduction

"You can see the orange rod measures 16 units. The
yellow rod measures 10. Now, if I turn the orange one, it will
cover 12 units."

Question

"Can you predict how many units the yellow rod would
cover if I moved it to the same angle? Explain how you got your
answer."






Scoring Criteria

Stage   Criteria                                            Score

I       Subject guesses. The answer has no support -          0
        "looks like it."

IIA     Subject qualitatively compensates.                    1
        "It should be smaller."

IIB     Subject compensates quantitatively through            2
        addition or subtraction. "Subtract." Go back 6.

IIIA    Subject quantitatively compensates using some         3
        multiplication or fraction. It should be
        less than 6 difference.

IIIB    Subject refers to a general solution. It is           4
        proportional. The proportion 10/16 is the
        same as 5/8.
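
The difference between the Stage IIB and Stage III answers can be made concrete with the rod lengths given in the task. This is a small sketch with illustrative function names; it is not part of the interview protocol.

```python
from fractions import Fraction

def proportional(a, b, c):
    """Solve a : b = c : x for x (same ratio, Stage III reasoning)."""
    return Fraction(b) * c / a

def additive(a, b, c):
    """Apply the same difference instead of the same ratio (Stage IIB)."""
    return c - (a - b)

# Orange rod: 16 units, covering 12 after turning. Yellow rod: 10 units.
print(float(proportional(16, 12, 10)))  # 7.5
print(additive(16, 12, 10))             # 6
```

The proportional answer (7.5) keeps the ratio 12/16 = 3/4, while the additive answer (6) repeats the difference of 4 units.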






5. Balance

Thinking tested

Proportional reasoning
Direct proportion
Physical

Materials

A light, unequal arm balance has hooks for weights and
there are 7-10 identical weights available.

Introduction

"Two weights just balance three on the other side. If I
add two more on the right, I will have 4 weights."

Question

"Can you predict how many I will have to add on the left
to balance again? How did you get your answer?"

Scoring Criteria

Stage   Criteria                                          Score

I       Subject guesses or has no answer                    0

IIA     Subject compensates qualitatively                   1

IIB     Subject compensates using some addition or          2
        subtraction 6 - Add up






Scoring Criteria (continued)

Stage   Criteria

IIIA    Subject uses a ratio or multiplication
        factor 2 = 3 so 4 = 6

IIIB    Subject uses an appropriate proportion and
        states some rule:

_1 h'." ling = 3 small ones 
. 3 • ;^ Jigs = 9 small ones 






6. Flag Pole



Thinking tested 

Proportional reasoning 
Direct proportion 
Physical 




Two rectangular wooden beams are laid out on a measuring 
grid. A high intensity light source is arranged to produce 
shadows . 

Introduction 

"The green rod you can see is about 8 units long. The 
blue one is about 5. When I set up the blue rod and the lamp, 
the rod has a shadow 10 units long." 

Question 

"Predict the number of units of shadow I would get if I 
set up the green rod in the same way without moving the lamp. 
How did you get your answer?" 






Scoring Criteria

Stage   Criteria                                              Score

I       No answer or a guess                                    0

IIA     Subject qualitatively compensates                       1
        "13. It's smaller"

IIB     Subject uses subtraction for a more quantitative        2
        compensation
        "I just subtracted"

IIIA    A ratio or multiplication factor is used                3
        5/8 = 10/16

IIIB    An appropriate proportion is used and a rule is         4
        stated
        "The short one is half as tall so the shadow
        will be half as tall."






7. BB Square

Thinking tested

Proportional reasoning
Direct as square
Geometric

Material

A square 2 units on edge, a square 3 units on edge, and a
ruler are set out before the subject. The larger square has a
small edge so that it may be covered with BBs.

Introduction

"It takes just 140 BBs to cover this small square." Do it.

Question

"Predict how many BBs would be needed to cover the large
square. How did you get your answer?"



Scoring Criteria

Stage   Criteria                                          Score

I       Subject has no answer or guesses                    0

IIA     Subject qualitatively compensates                   1
        "10 because it's less"

IIB     Subject uses addition to compensate
        2 + 1 = 3    140 + 70 = 210

IIIA    Subject uses a ratio or a multiplication
        factor 3/2 = x/140

IIIB    Subject uses appropriate proportion employing
        some rule
        9/4 = x/140    About 300, because it's the
        area.
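
The "direct as square" relation behind the Stage IIIB answer can be sketched as follows. The function names are illustrative; the numbers are those of the task (140 BBs on the 2-unit square, a 3-unit larger square).

```python
def scale_by_area(count, small_edge, large_edge):
    """A count that covers an area grows with the square of the edge ratio."""
    return count * (large_edge / small_edge) ** 2

def scale_by_edge(count, small_edge, large_edge):
    """Scaling with the edge ratio only, as in the 3/2 = x/140 response."""
    return count * large_edge / small_edge

print(scale_by_area(140, 2, 3))  # 315.0
print(scale_by_edge(140, 2, 3))  # 210.0
```

The area-based value, 315, corresponds to the "about 300" response in the key, while scaling by the edge alone gives 210.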






8. Pattern

Thinking tested

Proportional reasoning
Direct as square proportion
Geometric

Material

A pattern type drawing and a larger grid are presented to
the subject.

Introduction

"A small doll sized collar made with the pattern shown
uses 12 square centimeters of material."

Question

"How much material is there when I make a collar like this
from a pattern drawn on these larger squares? How did you get
your answer?"






Scoring Criteria

Stage   Criteria                                          Score

I       Subject guesses or has no answer                    0

IIA     Subject qualitatively compensates                   1
        "20 because it's bigger"

IIB     Subject uses addition as a quantitative             2
        compensation
        "36 because 12+12+12=36"

IIIA    Subject uses multiplication or a ratio              3
        "3x3=9  1/9 = 12/81"

IIIB    Subject uses an overall rule                        4
        "It should be 3 x 3 as much because it goes
        up as length x width"






9. Frosting

Thinking tested

Proportional reasoning
Inverse as square
Geometric

Material

A 4 cm x 4 cm wood square, a 10 cm x 10 cm wood square and
a thin cardboard 4 cm x 4 cm square are laid out before the subject.

Introduction

"Imagine that this is frosting which has been spread out
just 1/8" thick over this small cake."

Question

"Can you predict what would be the thickness of this same
amount of frosting if it were to be spread out over the larger
cake? How did you get your answer?"

Scoring Criteria

Stage   Criteria                                          Score

I       Subject has no answer or reason                     0
        "I don't know"






Scoring Criteria (continued)

Stage   Criteria

IIA     Subject qualitatively compensates
        "It would be less"

IIB     Subject quantitatively adds or subtracts
        "It's 6 more so about 1/14 to 1/16"

IIIA    Subject calculates using a multiplication
        factor ratio
        16/100 x 1/8 = 1/50

IIIB    Subject uses an appropriate proportion

        16/100 = x/(1/8)
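
The inverse-as-square calculation in the key (16/100 x 1/8 = 1/50) follows from keeping the volume of frosting fixed. This is a minimal sketch with an illustrative function name, using exact fractions.

```python
from fractions import Fraction

def spread_thickness(thickness, small_edge, large_edge):
    """Same volume over a larger square: thickness falls as the area grows."""
    return thickness * Fraction(small_edge ** 2, large_edge ** 2)

# 1/8 inch of frosting on the 4 cm square, respread over the 10 cm square.
t = spread_thickness(Fraction(1, 8), 4, 10)
print(t)  # 1/50
```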






10. Paint

Thinking tested

Proportional reasoning
Direct proportion
Physical

Material

A small (1 ml) measuring spoon, some "Tang" orange drink
and a 60 ml and a 250 ml beaker of water are set out on the table.

Introduction

"If I add two measures of Tang to the water in my small
60 ml beaker, I get a certain color and sweetness." Show this.

Question

"How much water should I add to make the same color and
sweetness with 5 measures of Tang? How did you get your answer?"

Scoring Criteria

Stage   Criteria                                          Score

I       Subject guesses or has no prediction                0

IIA     Subject estimates with some qualitative             1
        compensation






Scoring Criteria (continued)

Stage   Criteria

IIB     Subject predicts with some addition or
        subtraction
        "6, because 250/60 = 4. So 2 + 4 = 6"

IIIA    Subject utilizes a multiplication factor
        or ratio
        "About 8, 60/250 = 4 x 2 = 8"

IIIB    Subject utilizes the appropriate proportion
        and relates some general rule
        "For the same color it would be proportional"
        2/60 = x/250






11. Speed



Thinking tested 



Proportional reasoning 
Direct proportion 
Physical 



Material 










A cart is pulled by the experimenter with a 50 cm length
of string. A meter stick graduated into centimeters is used for
measuring. An electric timer gives digital readings of time in
tenths of a second.

Introduction

"I am going to pull this cart along. I want you to time a
30 cm run. The clock starts when you push it and stops when you
push it. Try it. Now do it with the run. Start! Stop! It took
____ seconds to go 30 cm."

Question

"If I were to continue pulling it along in the same way,
how long would it take to go 50 cm? Explain how you got your
answer."






Scoring Criteria

Stage   Criteria

I       Subject guesses or has no prediction

IIA     Subject qualitatively compensates
        "It should be more, about ____ seconds"

IIB     Subject quantifies his approach through
        addition
        "It's 20 more cm so it should be 20 seconds
        more"

IIIA    Subject consciously applies a ratio or
        multiplication factor

IIIB    Subject recognizes and states a general law.
        Subject uses proportion.
        "The car is going the same speed so...."






12. Boyle

Thinking tested

Proportional reasoning
Inverse proportion
Physical

Material

Bricks     Syringe
  0         30 cc
  2         20
  4         10

to compress the trapped air. Some extra identical bricks are
nearby.

Introduction

"This syringe, with its trapped air, feels kind of squashy."
Subject tries it. "With no bricks the syringe reads 30 cc; I'm
going to add two bricks. Watch what happens." Add reading to
chart. "Next see what happens with four bricks." Add reading to
chart.

Question

"Can you predict what reading the syringe should have with
five bricks on it? How did you get your answer?"






Scoring Criteria

Stage   Criteria                                          Score

I       Subject has no reason, maybe no answer              0

IIA     Subject estimates qualitatively                     1
        "It will be less"

IIB     Subject uses some subtraction for a somewhat        2
        quantitative approach
        "It should be 3 less"

IIIA    Subject calculates quantitatively with some         3
        multiplication factor
        2 x 20 = 40    4 x 10 = 40    5 x 8 = 40

IIIB    Subject calculates from differences using           4
        a sort of rule
        "5 bricks means the volume = 8
        Because 4/5 = x/10 so x = 8"
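
The inverse-proportion arithmetic in the key (bricks x volume = 40) can be sketched as below. This only mirrors the simplified model used in the scoring examples: it assumes readings of 20 cc with two bricks and 10 cc with four bricks, and it leaves out the 30 cc reading with no bricks. The variable and function names are illustrative.

```python
# bricks -> syringe reading in cc, as used in the scoring examples
readings = {2: 20, 4: 10}

# Inverse proportion: the product bricks x volume stays constant.
products = [bricks * volume for bricks, volume in readings.items()]
k = sum(products) / len(products)

def predicted_volume(bricks):
    """Predicted syringe reading for a given number of bricks."""
    return k / bricks

print(predicted_volume(5))  # 8.0
```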






13. Population

Thinking tested

Proportional reasoning
Direct as square
Physical

Material

A 50 unit ruler, a square 10 units on edge and a square 18
units on edge were set before the subject. 3 markers were placed
on the 2 measure square.

Introduction

"If just 3 cows can live on this much grass, 10 x 10 units,
what is the most number of cows that can live on a plot of grass
that is 18 x 18 units?"

Question

"How did you get your answer?"



Scoring Criteria

Stage   Criteria                                            Score

I       Subject guesses or makes no prediction

IIA     Subject qualitatively compensates                     1
        "About 5"

IIB     Subject uses addition to quantify his answer          2
        "11 cows, 18 is 8 more than 10
        8 + 3 = 11"

IIIA    Subject uses a ratio or a multiplication              3
        factor possibly inappropriately
        10/18 = 3/x, about 5

IIIB    Subject projects a general rule into the data         4
        and uses appropriate proportions
        5²/9² = 25/81 = 3/x, about 10
        "About twice as large a square has 4 times
        as much grass"






14. Probability

Thinking tested

Proportional reasoning
Direct proportion
Physical

Material

5 clear packets each containing 2 red and 3 yellow gum
drops and a paper bag are placed in front of the observer.

Introduction

"Notice that this bag has 2 red and 3 yellow gum drops.
Suppose you were to close your eyes and reach into the sack. You
could then get either a red or a yellow gum drop. Suppose now I
empty all of these into the paper bag."

Question

"What chance is there that you would get a red gum drop?
How did you get your answer?"






Scoring Criteria

Stage   Criteria                                          Score

I       Subject has no reason or calculation and            0
        possibly no answer
        "I don't know"

IIA     Subject estimates with some qualitative             1
        compensation
        "It's probably yellow because there are
        more yellow ones"

IIB     Subject predicts with some addition or              2
        subtraction to compensate
        "Now there are 5 extra chances for yellow,
        because there are 5 more yellows"

IIIA    Subject quantitatively compensates with             3
        a multiplicative or ratio factor
        "It's 2 to 3 for reds to yellows and now
        it's 10 to 15 or the same"

IIIB    Subject quantitatively compensates relating         4
        a general rule
        "2 to 5 for red and 3 to 5 for yellow. There
        are 2 reds to 5 candies and 3 yellows to 5
        candies. Putting in more keeps the same
        ratios"
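
The general rule in the Stage IIIB response, that pooling identical packets keeps the same ratio, can be checked with exact fractions. This is a small sketch; the function name is illustrative.

```python
from fractions import Fraction

def chance_of_red(red, yellow, packets=1):
    """Chance of drawing red after pooling `packets` identical packets."""
    return Fraction(red * packets, (red + yellow) * packets)

print(chance_of_red(2, 3))     # 2/5 for a single packet
print(chance_of_red(2, 3, 5))  # 2/5 after emptying all five packets into the bag
```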






15. Pulley (Karplus, Karplus and Wollman, 1974)

Thinking tested

Proportional reasoning
Direct proportion
Physical

Material

A system of two pulleys, one 3" in diameter, the other 2"
in diameter, mounted on the same shaft, are arranged so that as one
turns the crank one pulley pulls string in while the other lets it
out. These strings pull markers along a meter stick.

Introduction

"Hold onto this end (left) while I hold the other (right).
Now notice as I wind the crank, your end (subject) has moved 20 cm
while mine has moved 15 cm."

Question

"How far will my string move when yours moves 5 cm? How
did you get your answer?"






Scoring Criteria

Stage   Criteria

I       Subject guesses. The answer has no reason
        or calculation.
        "I can't explain it. I guessed."

IIA     Subject estimates with some qualitative
        compensation outside of any comprehension
        of the task or any rule.
        "When I had 10 you had 15, so when I get 6
        you should get more, about 8."

IIB     Subject quantitatively compensates with
        addition or subtraction without regard to
        any physical relationship.
        "Zero. 20 - 5 = 15 so 5 - 5 = 0"

IIIA    Subject quantitatively compensates with
        some multiplication factor. Does not seek
        out physical rule.
        "20 matches with 15 so 5 should match with
        about 4."

IIIB    Subject quantitatively compensates seeking
        out a proportional relationship and a
        physical rule.
        "15 is 3/4 of 20, so 3.75 is 3/4 of 5.
        The big pulley goes 4 for the little one's






16. Ruler (Karplus, Karplus and Wollman, 1974)

Thinking tested

Proportional reasoning
Direct proportion
Physical

Material

On a centimeter and inch graduated rule, a 4" long pencil
is placed.

Introduction

"Notice that this length of pencil extends about 4 units
on the inch scale and about 10 units on the centimeter scale."

Question

"Suppose I were to put down a pencil that covered 5 inches.
How many centimeters might it cover? How did you get your answer?"

Scoring Criteria

Stage   Criteria                                            Score

I       Subject guesses. Makes no calculation.                0
        "I guessed."

IIA     Subject estimates with qualitative compensation       1

IIB     Subject quantitatively compensates through            2
        addition or subtraction.
        "10 is 6 more than 4 so for 5 I would get 9."

IIIA    Subject quantitatively compensates without
        reference to any general relationship.
        "With 4 it's 10 so with 5 it's about 13."

IIIB    Subject quantitatively compensates iterating
        the relationship of inches and centimeters.





17. Weight

Thinking tested

Proportional reasoning
Physical

Material

Weights are placed off center on a light rod. Separate
spring scales measure the weight on each side of the rod. An
additional three weights are nearby.

Introduction

"You can see that these scales show how much weight each
set of wheels carry." Examiner lifts slightly one weight.

Question

"Now, can you predict how much each scale will register
if I add three more weights for a total of 5 weights? How did you
get your answer?"






Scoring Criteria

Stage   Criteria

I       Subject has no reason or explanation and
        possibly no answer.
        "I guessed."

IIA     Subject estimates qualitatively some
        compensation
        "About 6 and 2."

IIB     Subject compensates with addition
        "5 and 3 because it's one more"
        "6 and 4 because it's two more"

IIIA    Subject quantitatively compensates with
        some multiplication
        "It's 2 to 1 so with 5 it must be about
        10/3 to 5/3"

IIIB    Subject states a general rule
        "With 5 it must add up to 10 and be in the
        ratio 2/1 so it's about 6 and 3"






18. Light and Shadow

Thinking tested

Schema of proportions
Direct proportion
Physical

A chart, lamp and "mask" were attached to a meter stick.
The lamp and screen can be moved along the meter stick. An
observation screen 30 cm x 30 cm has on its surface a grid of 1 cm
squares. Light from a bulb goes out through a "mask" with a 1 cm
square hole and projects a square of light on the screen. The
"light" and "hole" are positioned at the same height and at the
center of the observing screen. Markings on the meter stick are
masked out. Letters note 10 cm marks on the meter stick. A meter
stick with centimeter markings is nearby for use in measuring.

Introduction

"Here is a light, a masking screen, and a chart. The way
it is now arranged it makes a lighted square with four units on the
screen."




Question

Initially seek out correspondence between change of "mask"
position and the projection with questions such as: "What would
you predict will happen if I were to move the mask toward the
light? toward the screen?" Do it. "With the "mask" at this
distance from the light, I get a projection just with four units
on the screen. What then should I do to get 16 units on the
screen? How did you get your answer?"



Scoring Criteria

Stage   Criteria                                            Score

I       The subject views the projection in the way           0
        it works. He does not perceive how the
        projection is formed on the screen.

IIA     The subject recognizes how the projection can         1
        be changed by moving the "mask."

IIB     The subject suggests how changing the "mask"          2
        location will change the projection size.
        The subject may use addition or subtraction
        to predict some sizes.

IIIA    The subject quantitatively calculates some            3
        predicted relationship between size and
        location. The subject measures distances
        from the light source.

IIIB    The subject links "mask" location and                 4
        projection size with an overall model of what
        is causing the change. The subject states the
        relationship in terms of a proportion.






19. Incline (Hensley, 1974)

Thinking tested

Overall schema of proportions

Direct proportion

Physical

Material

Welch Scientific Company Inclined Plane, Hall's Carriage,
100 gram slotted weights, weight hanger cord, meter stick.

An inclined plane demonstration device was used. Statements
of mechanical advantage, angles and distances were masked
out where they were printed on the device.

Introduction

"I have here a cart with some weights on it. It can roll
on the incline (demonstrate), it now stays where I put it."




Question

Seek initially all factors the subject can suggest. "What
should I do to make the cart move? What else could I do to make
it move? Up? Down? What other things could be changed? What
general rule can you suggest that will explain what will make the
cart move?"

"The cart is now balanced. If I take off 100 grams,
what else should I change to again make it balance? How much
should I change it? How did you get your answer?"



Scoring Criteria

Stage   Criteria                                              Score

I       Subject explains the situation in terms of the          0
        totality of the actions which he can perform
        (he pushes the car up the incline).

IIA     The subject perceives the role of the weight            1
        on the hook: more weight on the hook, the car
        moves up the incline. The subject does not
        perceive the role of the incline.

IIB     The subject is able to compensate the effect            2
        of weight with a change in the incline.

IIIA    Subject coordinates the role of the weight and          3
        inclination. The subject can state the overall
        rule but does not state the proportion with
        numbers or make a numerical prediction.

IIIB    In addition to the attributes at IIIA, the              4
        subject gives correct predictions, states the
        proportion with numbers, and may use
        words like "its proportions" in his explanation.






APPENDIX C
Calculations of Final Test Characteristics






Calculation of Criterion-Referenced Reliability
for 427 Grade 8 Pupils Tested with the Final Version

Jv I97U 



$$ r_c = \frac{r_{tt}\,\sigma_x^2 + (\bar{X} - C)^2}{\sigma_x^2 + (\bar{X} - C)^2} $$

Where

r_c = criterion-referenced reliability
r_tt = classical reliability estimate (Hoyt, 1941) = .779
σ_x² = variance of test scores = 20.81
X̄ = mean of test scores = 12.13
C = criterion level = 15

$$ r_c = \frac{(.779)(20.81) + (12.13 - 15)^2}{20.81 + (12.13 - 15)^2} $$

r_c = .842
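
The substitution above can be reproduced directly. This is a minimal sketch using the values stated on this page (classical reliability .779, variance 20.81, mean 12.13, criterion level 15); the function name is illustrative.

```python
def criterion_referenced_reliability(r_tt, variance, mean, criterion):
    """Adjust a classical reliability estimate for the distance
    between the mean score and the criterion level."""
    shift = (mean - criterion) ** 2
    return (r_tt * variance + shift) / (variance + shift)

r_c = criterion_referenced_reliability(0.779, 20.81, 12.13, 15)
print(round(r_c, 3))  # 0.842
```

The further the mean lies from the criterion level, the larger the squared term and the closer the coefficient moves toward 1.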






Calculations of Score Reliability for 427 Grade 8 Pupils
Tested with the Final Version
June, 1975

Score   Frequency
  3         4
  4         7
  5        19
  6        22
  7        24
  8        30
  9        22
 10        34
 11        34
 12        41
 13        38
 14        19
 15        31
 16        17
 17        21
 18        24
 19        16
 20        10
 21         3
 22         7
 23         3
 24         1

Mean = 12.13

SD = 4.56

Range = 3-24
(21)

Subjects = 427
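
The mean, standard deviation and number of subjects can be recomputed from the frequency distribution. This sketch assumes the frequency at score 21 reads 3 (the scan is unclear at that entry, and 3 is the value that makes the frequencies total 427), and it uses the sample variance with N - 1 in the denominator; the variable names are illustrative.

```python
import math

# score -> frequency, as read from the distribution (score 21 taken as 3)
freq = {3: 4, 4: 7, 5: 19, 6: 22, 7: 24, 8: 30, 9: 22, 10: 34, 11: 34,
        12: 41, 13: 38, 14: 19, 15: 31, 16: 17, 17: 21, 18: 24, 19: 16,
        20: 10, 21: 3, 22: 7, 23: 3, 24: 1}

n = sum(freq.values())
mean = sum(score * f for score, f in freq.items()) / n
# Sample variance, dividing by n - 1.
variance = sum(f * (score - mean) ** 2 for score, f in freq.items()) / (n - 1)
sd = math.sqrt(variance)

print(n, round(mean, 2), round(sd, 2), round(variance, 2))
```

With these readings the results match the reported N = 427, mean = 12.13 and SD = 4.56, and the variance agrees with the 20.81 used in the criterion-referenced reliability calculation.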



SV                     df       SS           Variance

Total                 10247    2561.7048     .2499955
Among items              23     212.2389    9.2277782
Among individuals       426     385.8585     .9057711
Remainder              9798    1963.6074     .200409

reliability (Hoyt, 1941):

$$ r_{tt} = \frac{(\text{Variance among individuals}) - (\text{Remainder})}{\text{Variance among individuals}} = \frac{.9057711 - .200409}{.9057711} = .779 $$
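
The reliability figure follows from the sums of squares and degrees of freedom in the table. This is a minimal sketch of that arithmetic; the function name is illustrative.

```python
def hoyt_reliability(ss_individuals, df_individuals, ss_remainder, df_remainder):
    """Reliability from the analysis-of-variance mean squares:
    (among-individuals variance - remainder variance) / among-individuals variance."""
    ms_individuals = ss_individuals / df_individuals
    ms_remainder = ss_remainder / df_remainder
    return (ms_individuals - ms_remainder) / ms_individuals

r_tt = hoyt_reliability(385.8585, 426, 1963.6074, 9798)
print(round(r_tt, 3))  # 0.779
```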






Tetrachoric Test-Retest Reliability

Grade 5 Pupils

                 Master            Non-master        Total

Master            5.3%  N=  5       8.5%  N=  8       13.8%  N= 13
Non-master       12.8%  N= 12      73.4%  N= 69       86.2%  N= 81
Total            18.1%  N= 17      81.9%  N= 77      100.0%  N= 94

Grade 8 Pupils

                 Master            Non-master        Total

Master           41.5%  N=174      11.0%  N= 46       52.5%  N=220
Non-master       14.6%  N= 61      32.9%  N=138       47.5%  N=199
Total            56.1%  N=235      43.9%  N=184      100.0%  N=419

r_t =

Chemistry Pupils

                 Master            Non-master        Total

Master           91.3%  N=136       3.4%  N=  5       94.6%  N=141
Non-master        4.7%  N=  7        .7%  N=  1        5.4%  N=  8
Total            96.0%  N=143       4.0%  N=  6      100.0%  N=149

Composite Sample

                 Master            Non-master        Total

Master           53.0%  N=179      10.4%  N= 35       63.3%  N=214
Non-master        7.4%  N= 25      29.3%  N= 99       36.7%  N=124
Total            60.4%  N=204      39.6%  N=134      100.0%  N=338
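For readers who want a rough feel for the size of a tetrachoric coefficient from a fourfold table like those above, one common shortcut is the cosine-pi approximation. The sketch below is an assumption for illustration only: it is not stated to be the procedure used in this study, and the cell counts are the composite-sample counts as read from this page.

```python
import math


def tetrachoric_cos_pi(a, b, c, d):
    """Cosine-pi approximation to the tetrachoric correlation for a 2x2 table
    with agreeing cells a (master/master) and d (non-master/non-master)
    and disagreeing cells b and c."""
    if b * c == 0:
        return 1.0   # no disagreement in one direction: treat as perfect association
    if a * d == 0:
        return -1.0  # no agreement in one direction: treat as perfect negative association
    odds_ratio = (a * d) / (b * c)
    return math.cos(math.pi / (1 + math.sqrt(odds_ratio)))


# Composite sample: 179 master on both tests, 99 non-master on both,
# 35 and 25 pupils who changed classification.
r_t_approx = tetrachoric_cos_pi(a=179, b=35, c=25, d=99)
print(round(r_t_approx, 2))
```

The approximation is coarse when any cell is very small, as in the chemistry table, so it is best treated as an order-of-magnitude check rather than a replacement for a full tetrachoric estimate.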



Cross Tabulation of Test-Retest Results by Reasoning Level

Grade 5 Pupils

                        Test 2
Test 1       0      1      2      3      4     Tot

  0         27     10      2      2      0      41
  1         12     13      1      6      0      32
  2          0      1      3      2      2       8
  3          3      5      0      5      0      13
  4          0      0      0      0      0       0

 Tot        42     29      6     15      2      94

Raw chi-square 55.1
with 12 degrees of freedom
Significance < .0001

Grade 8 Pupils

                        Test 2
Test 1       0      1      2      3      4     Tot

  0         25     12      9      5      1      52
  1         22     27     16     13      2      80
  2          4      7     17     26     13      67
  3          3     20     16     79     35     153
  4          0      4      3     19     41      67

 Tot        54     70     61    142     92     419

Raw chi-square 227
with 16 degrees of freedom
Significance < .0001



Chemistry Pupils

                        Test 2
Test 1       0      1      2      3      4     Tot

  0          0      0      0      0      0       0
  1          0      0      0      0      2       2
  2          0      0      1      1      1       3
  3          1      2      2     58     16      79
  4          1      0      1     17     46      65

 Tot         2      2      4     76     65     149

Raw chi-square 51.99
with 12 degrees of freedom
Significance < .0001

Composite

                        Test 2
Test 1       0      1      2      3      4     Tot

  0         33     13      2      5      0      53
  1         16     20      6     10      2      54
  2          0      1      8     12      5      26
  3          6     11      5     82     24     128
  4          1      1      1     21     53      77

 Tot        56     46     22    130     84     338

Raw chi-square 295.0
with 16 degrees of freedom
Significance < .0001
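The raw chi-square for the composite cross tabulation can be recomputed from the cell counts. The sketch below assumes the composite counts as they can be read from this page, with cells that are illegible in the scan filled in from the row and column totals; the function name is illustrative.

```python
def pearson_chi_square(table):
    """Pearson chi-square and degrees of freedom for a contingency table
    given as a list of rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:  # empty rows or columns contribute nothing
                chi2 += (observed - expected) ** 2 / expected
    # Rows and columns with zero totals do not count toward the degrees of freedom.
    rows_used = sum(1 for t in row_totals if t > 0)
    cols_used = sum(1 for t in col_totals if t > 0)
    return chi2, (rows_used - 1) * (cols_used - 1)


# Composite sample: rows are Test 1 levels 0-4, columns are Test 2 levels 0-4.
COMPOSITE = [
    [33, 13, 2, 5, 0],
    [16, 20, 6, 10, 2],
    [0, 1, 8, 12, 5],
    [6, 11, 5, 82, 24],
    [1, 1, 1, 21, 53],
]
chi2, df = pearson_chi_square(COMPOSITE)
print(round(chi2, 1), df)
```

Dropping empty rows and columns from the degrees of freedom is also consistent with the 12 degrees of freedom reported for the grade 5 and chemistry tables, each of which has one empty row.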



Pearson Correlation of Test-Retest Reliability

Pupils    Test Period    Cases    Mean    SD    Pearson Correl. Coeff.    Level of Significance



5th Grade } ^ H hi -"^ 



Chemistry (11th Grade) ^ ll'.O I'.t '^^ 



composite ^ ^ 5-7 .001 



APPENDIX D
Final Paper-Pencil Test 



216 




SCIENCE PROBLEM SOLVING TEST 



Use of the Test 

This test is intended for use with grade 8 pupils, that is, persons
who are approximately 13 years old. It will be completed within
30 minutes by 90 per cent of such pupils. The test may be used as
low as grade 5, that is with about 9-year-olds, or as high as grade 12,
that is with about 18-year-olds. Use at these extremes will reduce
the reliability of measurement. Pupils at the high ages will have
scores clustered in the high ranges. Pupils at the low ages will
have scores clustered in the low ranges.



Directions for Administering 

Pupils should have a good writing surface, a pen or pencil, and
answer sheets with A B C D E answers for 24 questions.

Test Scoring 

The correct order and answers to test questions are listed.
Mastery at each level is four or more of the six correct.

Level I      Level II     Level III    Level IV

 1 - D        2 - B        3 - C        4 - C
 5 - A        8 - D        7 - B        6 - B
 9 - A       12 - A       11 - A       10 - B
15 - D       14 - B       13 - A       16 - D
20 - C       18 - D       17 - C       19 - A
23 - B       21 - A       24 - C       22 - B



Grading of master (1) and non-master (0) responses follows this form:

Preoperational    Level I    Level II    Level III    Level IV

0000              1000       1100        1110         1111
                  0010       0101        0110         1101
                  0001       1001        0111         1011
                  0011       0100        1010
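The scoring rule can be sketched as a small routine. It assumes the answer key as listed under Test Scoring, the rule that four or more correct out of six is mastery at a level, and that each four-digit pattern gives mastery at Levels I, II, III and IV in that order (an assumption, since the page does not state the digit order). All names in the code are illustrative.

```python
# Answer key by level: item number -> correct letter.
ANSWER_KEY = {
    "I":   {1: "D", 5: "A", 9: "A", 15: "D", 20: "C", 23: "B"},
    "II":  {2: "B", 8: "D", 12: "A", 14: "B", 18: "D", 21: "A"},
    "III": {3: "C", 7: "B", 11: "A", 13: "A", 17: "C", 24: "C"},
    "IV":  {4: "C", 6: "B", 10: "B", 16: "D", 19: "A", 22: "B"},
}

# Mastery pattern -> assigned reasoning level, as tabulated above.
PATTERN_LEVEL = {
    "0000": "Preoperational",
    "1000": "Level I", "0010": "Level I", "0001": "Level I", "0011": "Level I",
    "1100": "Level II", "0101": "Level II", "1001": "Level II", "0100": "Level II",
    "1110": "Level III", "0110": "Level III", "0111": "Level III", "1010": "Level III",
    "1111": "Level IV", "1101": "Level IV", "1011": "Level IV",
}


def classify(responses, mastery_cutoff=4):
    """Return (mastery pattern, reasoning level) for a dict of item -> letter."""
    pattern = ""
    for level in ("I", "II", "III", "IV"):
        key = ANSWER_KEY[level]
        correct = sum(1 for item, answer in key.items() if responses.get(item) == answer)
        pattern += "1" if correct >= mastery_cutoff else "0"
    return pattern, PATTERN_LEVEL[pattern]


# Example: a pupil who answers every Level I and Level II item correctly
# and leaves the rest blank.
example = {**ANSWER_KEY["I"], **ANSWER_KEY["II"]}
print(classify(example))  # ('1100', 'Level II')
```

Unanswered items and "E. I have no answer" responses simply count as not correct, so a blank answer sheet falls in the Preoperational category.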



217 



Copyright 1976 
Orville Ruud 



SCIENCE PROBLEM SOLVING TEST



TEST FORM # 1 (Version 9)



Date 4/2:^ 



Directions: Select the answer that most closely is the way you would solve each problem.
Mark the letter of your answer on the answer sheet in this manner: A B C D E



Mary buys 3 tickets to a raffle where 90 tickets are sold. Jane buys 1 ticket to a raffle
where 30 tickets are sold. Sue buys 3 tickets to a raffle where 300 tickets are sold.

Which girls have about the same chance of winning?

A. Jane and Mary because theirs are the least tickets

B. Sue and Mary because each have 3 tickets

C. All girls have the same chance

D. Jane and Mary because 3 chances in 90 is the same as 1 in 30

E. I have no answer



2 (1C2)

A ring is held between a table and a light bulb. The light casts a shadow of the ring
onto the table. If the ring is moved closer to the table, the shadow may:

A. Become larger because the shadow spreads out

B. Become smaller because the light rays don't
spread as much

C. Stay the same because it's the same ring

D. Become larger because the bulb is farther away

E. I have no answer




3 (2F1)

A lunchroom is 60 ceiling tiles or 25 chairs wide. If a classroom is 12 chairs wide, how
wide is this classroom measured in ceiling tiles?

A. Seems to be 50.

B. About 40 because it has to be less.

C. About 29 because 60/25 is about 29/12.

D. About 47 because 60 is 35 more than 25
and 47 is 35 more than 12.

E. I have no answer.



218 



4 (10F2)

Here is a recipe for 4 cups of cocoa:    Heat to near boiling 4 c. milk

                                         Add with stirring 6 T. sugar

                                                           5 T. cocoa

How many tablespoons of sugar would be needed to make 12 cups of this cocoa?

A. 18 tablespoons because 6 + 12 = 18

B. More than 6 tablespoons because there is more cocoa

C. 18 tablespoons because 6/4 equals 18/12

D. 14 tablespoons because 4 c. + 8 c. = 12 c.
so 6 T. + 8 T. = 14 T.

E. I have no answer



5 (11C1)

A car moving at a constant speed of 30 mph will, if pictured at one second intervals, look like:

A. I because it moves equal distances each
second

B. None of these because it is moving

C. II because it changes

D. II because it is increasing its distance

E. I have no answer



6 (1F2)

A ring 3 inches across is 2 feet from the light and 4 feet from the table. The 3" ring has
a 9" shadow. Where should a 4" ring be placed to make the same size shadow?

A. The shadow will be larger than 9" wherever the ring
is placed.

B. About 3 ft. from the lamp because x/4 = 2/3
and 3x = 8

C. About 3 ft. from the lamp because 2/3 x 4 = 2.7

D. About 3 ft. from the lamp because the ring is 1"
larger: 3 + 1 = 4 and 2 ft. + 1 ft. = 3 ft.

E. I have no answer




219 



7 (18F1)

A. About 11 feet. The 5 foot one is 2 more than the 3 foot one and 11 feet is
2 more than 9 feet

B. About 15 feet because 3/9 is 5/15

C. About 12 feet because 9 + 3 = 12

D. About 18 feet because it would be about twice as far

E. I have no answer



'.|;h^,- v^itt uvfv a y X V screen :r feet away. To make 




8 (3C2)

This person sliding down a hill looks at her watch.
Each second she puts a stick in the snow. What most
likely would be the pattern of these sticks?

A. I because she moves each second

B. II because she speeds up

C. I or II because she is moving

D. I because her speed is changing

E. I have no answer




9 (2C1)

A student's desk measures about three textbook lengths or 5 pencil lengths wide. If a
teacher's desk is 4 textbook lengths wide, how wide is a teacher's desk measured in pencil
lengths?

A. More than 5 pencils because it is bigger than a student desk
B. Less than 5 pencils because it seems that way

C. About 4 pencils because it was 4 textbooks

D. 5 pencils because that is what the student desk measured
E. I have no answer













s 









220 



10 (l2l.v>^ 

tooks on t::|i o[ ihis aiv sr liny coinpiess the npriiii. Tor 2 jou 
For 9 book: it is 1.8 cm. Mial >lu>uUl bo ah;C' vj)ring Kr; 



A. About 5 cm. to 4 ^ 
1.8 cm. rind 8 cm. 

B. About 3 cm. because 



the »-v. 



C, kbo'^t S cjn. because 2 



:causo it bar to b- uh nit half t-twt 

2 books a 1, 8 cm . 
9 books" STO cm. 

2 books - 3 . 2 c m . 
5 books 8.0 c;a. 

i X 8 = 3.2 



^pxin^- is 8 un. long. 
) boons? 



t 

i 



D. About 5 cm. because hooks - 2 books = 3 l ooks 
and i cm. ~ 3 cm. = 5 cm. 



E. I have no answer 



11 (10F1)

Jim uses 4 heaping teaspoons of Tang powder with an 8 oz. glass of water. How much Tang is
needed for the same mixture with 12 oz. of water?

A. About 6 teaspoons because 12/8 x 4 tsp. = 6 tsp.

B. About 8 teaspoons because 8 oz. + 4 oz. = 12 oz.
and 4 tsp. + 4 tsp. = 8 tsp.

C. More than 4 teaspoons because there is more water

D. 4 teaspoons because it is the same mixture

E. I have no answer



Four cars have different speeds: Car A is the fastest, Car B the next fastest, Car C the next
fastest, and Car D the next fastest. The fastest car takes the least time to go 200 miles,
the next fastest car the next least time and so on. Which car is the third fastest and takes
the third least time to go 200 miles?

A. Car C because:    1st fastest         2nd fastest         3rd fastest
                     CAR A               CAR B               CAR C
                     1st least time      2nd least time      3rd least time

B. Car B because     1-CAR D             2-CAR C             3-CAR B

C. No car because they don't match up

D. Car C because:    1st most fast       2nd most fast       3rd most fast
                     CAR A               CAR B               CAR C
                     1st most time       2nd most time       3rd most time

E. I have no answer



221 



"13 (8Fi) 

A model airplane wing made from the 2 cm. pattern shown
measures 19 cm. long. What would be the length of
a wing made from a pattern with squares that
are 6 cm.?

A. 57 cm. because 6/2 x 19 = 57

B. 18 cm. because it looks that way

C. 22 cm. because 19 + 3 = 22

D. 19 cm. but the squares would be larger

E. I have no answer



14 (5C2)

Trial I     4 people on side "A" balance 6 of the same size people on side "B"
Trial II    8 people on side "A" should balance how many on side "B"?

A. About 10 because 4 more on "A" should balance 4 more
on "B"

B. About 12 because it goes up 6 and 6 + 6 = 12

C. About 10 because it takes 4 more and 6 + 4 = 10

D. About 11 because it should be more
E. I have no answer

15 (4C1)

The "O" rod here crosses 8 lines. The "Y" rod crosses 5 lines. The "O" rod, when turned,
crosses 6 lines. How many lines would the "Y" rod cross if it were at this angle?

A. About 8 because it should get longer

B. About 5 because the "Y" rod is that long

C. About 6 because the "O" rod was 6

D. About 4 because the "Y" rod is shorter

E. I have no answer







222 






On the 

weights 

balance 

A. 

B. 

C. 

D. 



1 ini5- trc:ccfl the cart urn) its \ ' ir;>t is balanced hy 
tho s:,)rt:ii:. Whut amount .ol wo: :ht y needed to 
J g of xart V jxpht at 20^ ? 

« 

33 becz .5C x 400 = 133 

500 

130 bec:ij;> - . t is more 
:77 bec:i:isr It goes up 17 for ev-rry j 



133 becairsts 230 = 133 
i>^BO 400 
E. I have : r answer 



Angle 
10° 
10° 

20° 



/Weight 

Cart Strin^i 

300g S2 
300g 100 
400g ? 




I ' "^^^ ^^^^^^^ - ^ — "o- far Will it have traveled 



I by the end of 5 seconds? 

A. About 264 feet because 3 x 88 = 264

B. About 100 feet because it is only 2 seconds more

C. 220 feet because 88 x 5 = 220

D. 91 feet because 3 sec. + 2 sec. = 5 sec.

and 88 ft. + 3 ft. = 91 ft.

E. I have no answer



18 (10C2)

Here are some recipes for Kool Aide:

              Kool Aide Powder    Sugar     Water

1 quart       1/2 pkg             1/4 c     1 qt
4 quarts      2 pkg               1 c       4 qts
5 quarts      ?

How much powder is needed for 5 quarts of Kool Aide?

A. 2 pkg because it is the same mixture

B. 3 pkg because 4 qts + 1 qt = 5 qts
and 2 pkg + 1 pkg = 3 pkg

C. About 3 because it would have to be more

D. 2 1/2 pkg because 4 qts + 1 qt = 5 qts
and 2 pkg + 1/2 pkg = 2 1/2 pkg

E. I have no answer



223 






"19 

A freeway driver keeps track of the distance he travels. He finds that in 4 minutes he
travels 3 miles, in 10 minutes 7 1/2 miles. If he continues at this speed, how long will it
take him to travel 10 miles?

Distance        Time

3 miles         4 min
7 1/2 miles     10 min
10 miles        ? min

A. About 13 minutes because
4 min./3 miles = 10 min./7.5 miles = 13 1/3 min./10 miles

B. About 13 minutes because 10 - 7 1/2 = 2 1/2 miles
and 10 + 2 1/2 = 12 1/2 min.

C. About 13 minutes because 4/3 x 10 = 13 1/3

D. About 14 because 7 1/2 + 3 = 10 1/2
and 10 1/2 + 4 = 14

E. I have no answer

, 20 (9Cl; 

; Imagine that frosting had been spread out k inch thick on top of a small 6" ~ 6" calce Pr^rUr,- 
'Ske? "^"^^""^ """''^ "■'^'"^ °f f'-'^sting were spr.ad out o^^er a 12^"x 1^' 



A. 


More than 


k 


inch 


because 


B. 


L::ss than 


h 


inch 


because 


C. 


I-ess than 


k 


inch 


because 


D. 


More than 


h 


inch 


because 


E. 


I have no 


answer 






These nature hunt groups are chosen for a nature hike:

Mrs. Andrews - 5 students
Mr. Denton & Mrs. Felk - 8 students
Mr. Holt - 6 students

The teacher with the most students to help is:

A. Mr. Holt because 6/1 is larger than 5/1 is larger than 8/2

B. Mr. Denton & Mrs. Felk because 2/8 is larger than 1/5 is larger than 1/6

C. Mr. Denton & Mrs. Felk because they have the most students

D. Mrs. Andrews because she has fewer students

E. I have no answer

224 






Sketch #1 of a house is 5 pencil widths or 2 pennies high. Sketch #2 of this house is not
shown. Sketch #2 looks the same but is 8 pencil widths high. How high must sketch #2 be
in pennies?

A. About 3 because 8 - 5 = 3

B. About 3 because 2/5 = 3.2/8

C Ab.jLit 3 beca-^e 2 q = 3.2 

D. About 3 because it has to be more
E. I have no answer




Ulcere// 7 




D 



23 (1C1)

A ring is held between a table and a light bulb. The light bulb casts a shadow of the ring.
If a smaller ring was held in the same place the shadow of the smaller ring would:

A. Be smaller because the light would change

B. Be smaller because the ring is smaller

C. Be the same size because the ring is in the same place

D. Be larger because it is different
E. I have no answer




24 (17F1)

Jane is weighing out apples on this supermarket scale. What will 14 apples weigh if 6 apples
weigh 2 lbs?

A. 10 lbs because 6 + 8 = 14
so 2 + 8 = 10

B. 3 or 4 lbs because it is more

C. 4 2/3 lbs because 2/6 x 14 = 4 2/3

^' i because 2 +2 1 = 5 




E. I have no answer



225 






APPENDIX E 



Pupil Results and Test Improvements
in Versions II-VI



226 






VERSION III [Test Forms 7 1 changed frotn VERSION II 



I4C, 

cor feci- 



VERSION' II VERSION III ''")^* ' ^^^^^^^ " "^^^^ ^''"'^ ^'^l*''^^ "''^ ■'^''^ ''"^'^ ' ^''"^^^ ^''^f* 

mMmm mmt^ ' khcio lH tlckfiU arit ID]*] Soo buyn ] lickcti 10 a r»fflc where M ticU'ti m lolJ, 



All 


0000 
36 


1000 
10 


All 
7 


0000 
IS 


1000 
0 


1100 
39 


irhich filrl) h^vc nbut the ibm chsnu or vlnniitt? 
A. Jine mil >\»i htzmt arc \k lean ikUti 


S 


ID 


IB 


10 


10 


24 


6 


61 


0. SiK iikl ^tiry bccauic cscli h>vt 3 ticUti 


X 


10 


9 


10 


15 


24 


0 


0 


C, Ml llrli \im iho idi»(> (^ance 




62 


36 


70 


63 


24 


94 


0 


( D.) Jane ^bry because 1 * 1 
U ^([ 35 

E« 1 hive no mwr 




0 


0 


0 


4 


8 


0 


0 


c 



CHANCES (11:9 rpsponiios and the question dppcar appropriate.) 
None 



lie, 



VERSION II 



VERSION III 



All 


0000 


1000 


All 


0000 


1000 


1100 




62 


9 


90 


70 


40 


98 


100 




3 


0 


10 


6 


14 


0 


0 


21 


55 


0 


14 


27 


2 


1 

0 


10 


27 


0 


6 


10 


0 


0 


3 


9 


0 


2 


4 


0 


0 



01 bcciu5& It NYci cqtul ^litincci eich ^ 



I. HonD of ihcie facMu» It it iiovin{ / 
C, II bcciuto it ct)in(ci J 

til bcwwc It li Incrciilnc iti dijtatiw ^"'Ipr 
E. I fiivi nff amwer ^ 

CHANGES 
None 



9C 



VERSION II 



VERSION III 



All 


0000 


1000 
-iO- 


All 
-X- 


0000 


1000 


ItOO Hhit the 
wkct 

* 

It , I 


0 


0 


^0 


2 


7 


0 


0 1. 


69 


5S 


70 


B3 


SB 


98 


100 . 0 


14 


2S 


10 


9 


20 


2 


0 D. 


7 


9 


10 


3 


6 


0 


0 B. 



I Lrii thin \ tnth beciuie Jt covon horc eiic 
llore than \ inch becsuio there ii lore cnke 



J 

/ 

0 



mm 

None 






4C, 



VERSION n 

All 0000 1000 



esioN III 
Ail 0000 looo udn 



0 


0 


0 


5 




, 0 


0 


10 


27 


0 


10 


21 


2 


26 


7 


18 


0 


4 


10 


21 


11) 


72 


27 


100 


72 


43 


71 


6S 


10 


27 


0 


9 




fi 


n 



L *T rdJ hrrr tm^n I )ine>. Ue 'T t^l cfOJf" B lino. T t^*^ 'i:?^^-*. 
cronseit (i liii.'S. U BJiir lln^'i »*viilJ the "V* CW" iMl w.'^ t\ t!it* rji;l.*, 

I 



, A. About % kcitise p ^ ^ ' 

h About 5 liN'Miue the reJ ii that Imf 
['C.i Alitiut 6 1'ccimse the "0" roJ waj (. 
^ Ahoiit 4 k(.tiije. ihc "Y" wJ it ihorter 

B, 1 huvc ii<^ 7ii)ver 

CliANGES 

Notjp (lifnnij} licy) ' ; 



0 



"-?-m-l 



/ A i C J- < 7 f 



,7 



0/ 



■ I i i 



•ve-rsifti III ' iesi forms m ^ as cl§nqecl from version ii 




2Ci 



V![|J5I()N 



l!m li tltctrh '1 of % pjpcr JoH. 51tct:)i ii 10 pell vidthi or 3 
Alt 0000 1000 i^iurUr: Mjh. S^clth 'i if this pqicr doll ii not sho^fn. Sketch H loob 

nil uuuw jwuw the 1430 but U W pcft^l wl'Jihj hlth. Il«w hith wit iIc«cH2 be In qjuJcn^ 

45 9 60 ^^Hori Uian 3 qurton bccjun paper do)) 'J ii larjir 

l« Fwif quirieri beciusi It leoi that Kif / 



17 



0 



0 



30 



C. H quirtori Veeiusg It ii 14 penelli . ^ 

D. MM flinbtr of quimri ilaci lU tht lao ppsr doll 3 
t, Duvtssmuir 




All OOOO 1000 1100 



94 



fiorc thin 5 Ueiuic bi^c^r than i ittirfcnt AiL 
. i« Uiv tht S brciitfe it:iQfu that hi/ 
. C. .bout < UctuSQ it Mt^4 tutbooli 

Bi ) liAVf wmkit 



3 ■ 
J 



< 



(MGES 



REASON 



Student desk and teacher desk compared in place of paper doll.



More familiar. Wish more success with this item.
Students asked where was the other paper doll.



Simpler integer ratios: 10/4 becomes 5/4.



More appropriate to the problem. Intend a simpler problem.



VERSION II 



VERSION III 



ICi 



All 0000 1000 



All 0000 lOOO 1100 Arlii|:is!iol(n'CKe(h!jUblcnrJnl]r.hthulb. Thp )i);lit Mb mti i .tliaJow of the rins, 

\\ u Mjlltr ilii!. is lu')»l 111 thr vn I'l.ijc iW ^hIlllwl of the HuiDcr T\ri<^\i 



2 3 



A, Ic si;illcr lifcauw the Hfli; m\\^ cbintc 
r. Pt larccj liccfluje ii in dlffcivnt 



/ 



C.I llic sm- tljf hi'fjuwe ilip riiit ii In the iwc itlac* J 
11.) It wlliT liffauLfl ih: rjfi;; 1> jrallcr 




llcsponsts appear approprlata 






CHANCES 

■ None (Wrooji key - Version III) 




VliRSlON ?. 




Vi:[»Si()N 3 



I4C 



6C. 



VEASION II 

Air 1110 1100 
52 



VERSION III 



10 



5 



0 



20 



30 



10 



ttioiB naiufo Iwfflt tfwpi ire (hoicn for i mm Mk^ Hri, hMm 

Mr. [WittOA I 

All UIO llflO lODl) ^ , 



« 8 iNcfitl 
Mr, DiNiton I Pri. F«U • I itwlcnti 
Mr. lloU > 6 lU^enti 



10 



2S 



10 



45 



67 0 Hr. HoU bccwii 6 it lircor thiin S ir lirccr iKin a 

I. Hf, Unm { Kri. Fflk btciuio 2 ii lirsir tliii) 1 li Ur(er thm ) 

ill 

iL C. Hr, Um ( lirii Fcii bociuu they hivo ihi lOJt itvdwti 

0 0, Hri. Andr«wi b«ciui« ik hn fewor itwlenti 
2 

■ ■ 6^ I Hivt no iniHer 



/ 

0 



QIANCES 
None 



REASON 



Appeared satisfactory - wanted and got about 50% success.



VERSION U 

All 1110 1100 
S9 



10 



17 



0 



60 



10 



VEHSION HI 

hw COM hwd Jlffcrnii tpjcJv Cur K li the fmicsr, Cir B ih$ m\ faiten, Car C rtc nnt 

All 1110 1100 1000 tf« '^ fV''**'^"'^'''''''^ ^1^^^^ 
Die third lent tlna ro !00 Ml)o!t? 



53 



16 



57 



37 



00 



10 



CHANGES 



. B. tin 

, C. HocnrlttiaiiiotheydonUMldiup 
K 

fi. Cir C Icciuio: lit noil fiit 
hi luit tiRi 



REASON 



B ' 

Idil ho)t flit 
I 

hi m\ tiio 



Iti u|t flit ,9 
3rd Doit tlBi 0 



Appeared satisfactory - wanted and got about 50% success.



All 1110 1100 1000 fl^llIr|l2^oot^ll||»lehlMlh^J«3Sf^etlw^ llwlwpihj(|[)^'«miU^^ 
'hill? ' 



All 


lltO 


(2 


33 


IT 


67 


U 


0 


0 


0 


3 


0 



10 



0 About )& fttt beciuto: rill polfi 
f%im 



^ . I J About I ftet beciuio the ittm ii lesi than !i ii bi| ^ 
C. About 7 flit bieiuie it ihwid indiiii lUi thi (|i( poliJt 



i, ttoul (1) rE«t Uom It mu thit mf 
I, I Im» m m« 



':■ OlWas (Vr«i{ kg/) 
A Uitd t)i» nuiben 20, 10 ml S to' jlv» «cognl:ablo wltlplcs, 



/ 



All UIO 1100 1000 



REASON 



-V- 

A :o ft. flit poll hai I ihidov J3 ft. Ions, A 10 ft. tw hii j iJjjJ^y JJ ft Jqr{, 



38 


30 


20 


68 


28 


27 


6 


17 


12 


33 


16 


4 


11 


10 


4!i 


2 


11 


0 


6 


8 



lloM loii( I iliiJoy MDUSft, pcrion hivcT 

.0 



Abotii 12 ft, bctme 31- IS 1 2$ 
ia4 » • n i i: 



B. About 12 ft, hrcAulo it ii b);(er (Hin the m 

C. About 24 fi, bctiuso the kui ii S ft. leii^ 

D. About Id f{, bcduie tt Kmi thit Niy ^ 
Gi I hive no *mct q 



1. Wished more appropriate level - Version II was too hard.

2. Wanted a correct answer obtainable without formal thought.





5C, V!il»5lHN 

■'■ m Mtft llOfl 1000 • m >l < * ^"^^ ''n'''li*S 

Ml UIO IWO itfW JJJ^j^jj s[caflc«»ldeT«aJhnccho.wnyonilJfi'rf 

A, Aboul ? tow one wrc on "A" »huulJ h\m m wrcjf n t 

0 Ab«ut ? bMwn 6 li I en ihjn I ' 





5Ji 




50 


i 






0 




53 


1 


:o 




0 


0 


:o 


1? 


0 




30 



C. Hboui 1 bfcwic 4 ♦ 1 • 5 •iiJ 6*1-7 

D. Abcut 7 bc(iti}( It }hog)il loro 

E. thtva fid tnfvcr 



/ 



ami 



'l, Orljinil cnJiticns vl:: :-3 viero ctiangoJ to 2-4 
•1-6 were changed to 4-8 
5-t «ete clian£v'il to 6-? 



3C 



All 1110 llOO* 1000 mi EitH "Ji^ ■ "icl^ ^f** NS\ 
Vhat ivil lUMy U thi yjttfm of the« nickilSi^jU 



:i 


33 




20 


i: 


0 




:o 


:i 


0 




10 












67 




so 






0 

z 




5 


0 




0 



A, . n btciui ihv trivili f ich ifcond 
I. Hi >«tiu» It i» I it«tp Mil 

11 

Ul AT 



3\ 



■01," 



b<ctu$B htr ipccd it ch»(iK( 
E, I htvi nt antvcr 



/ 
0 




1, Only tweunpUi wti In Vorsion III In m attempt to conccntiato on rciiionj. 
VoubuUi}- chin^e from trawls to mts, ■ 



IC, 



All 1110 1100 1000 



7 


0 




0 


10 


0 




0 




0 


w 


0 




ICO 


2 




17 


0 


30 



)l Tin U held l^ctvetn i tible inJ « lUht bu)l>. Uto Hnht eau^ a iliii<}o>.' thr rlni; miiD 



Utm Ursn if thi ri^it is tloier to the isbU 
I, iKdM duller If thi rin( ii cloifr to tho llsht 3 
lfti\r\ thf mi tlit f«(jrilli-i> of ikj« ilu' ritit 1< |it.uTj^ 
I l««e Urtfr i( th» rlnj l> wvtJ do^r to the Hjht 
E, I liivi no tmt , 0 



o 



K Revoniing ot* i]ue5tii?n ;^tcn fron "A ring iii held bv'tveen a tiibk and a llfiht bulb" to 
, "If the llsht U aovtJ clo«r to the table", 

;!;;; ;:g[^(^"nj of answer and dUtracters to afford an answer In tenns of a physical tu}dal, 




3 



Trill r 2 pooplo on ilJo T btloiico 4 of tho ir-9 p^f/.lfi m tili T 
All 1110 llOO 1000 Ttlrlll < people on jWc "A" bilnneo 8 of thi j3'9 ii:o r" 'iB oiiU'e T 
' W»l III 6 pooplfl on sldo "A" ihotJid biljncc hw f Jiiy tm'A 

,A. About 10 bccsuM } uro m T itould failinci tvo mv, 3 t\ ^ 



8 


7 0 


10 


S2 


75 100 


20 . 


17 


0 • 


62 


16 


10 ^ 


4 


6 


10 • 


2 



0 About 12 bKiuii it {oii up Und I H > 13 ^ 
, C. About Id beeiuie It ti)(ei 3 Boro W)d ( O ■ .10 X 
^ D. About 12 bi'tatita 'll ^bould b< wo / 
£» 1 hive no inivor (9 



REASON 

1. This allowed a correct additive solution since the problem's difficulty was
hypothesized to be a result of its use of ratios. Form II was too difficult.



All 


1110 1100 


000 


000 


Thtt porion t\UiK > l>^ll ^'^^^ 
Sicli iccofld ihii puti 1 Itltk in the tnoi/. t'kit t'/. 
lUflly Muid bi tho pattern of \\m nitUi 


t 


10 


0 6 


IS 


'15 


. ■ ' . A, I titciuU iho *m\ 006 lecMd 


J 


12 


10 0 


17 


26 


1, II bOCAUSC U ll 1 itocp hill 


/ 


12 


3 3 


S6 


4 


Ct 1 or II bcuii» iHo It fi^vlne 


I 


60 


87 84 


5 


43 


bcciVN bor ipeed ii (buiclrc 


f 


6 


- 6' 


4 


11 


t, I bAvo no mv» 


0 




mm ' ' . : 

1. (fished to make this question more easily conprchended and answered on the basis of rtum ^rj 

2, Stiidont uskod obuut tho travel ing* ■ 'fii 



All 1110 1100 1000 
21 D 2fj IS 



A rl(i[ ii hold bo!«Dn i t«b)c md i Jitht bulb. Tlic \U'\ (iMi » " of ''r 
onto tbi tiblt. If tliB rint 1j wvid cioicr to i!p till'., tht iHiivr 



69 


97 


74 


67 




0 


1) 


0 


7 


3 


0 


12 


2 


0 


0 


'6 



hi locolo Ureoi ht^m ilntiow jiprci'j} vut / 



0 



SotMO iiillir bocaiijQ tlii lijht T)yi (Im»i 
ipre)d IS Ruch 



C. Sti/ Ik ^hM Lc^nr* It'^ tk w-'i r'^n ^ 
0» iBCDiic Jarcor bcMuw th« bulb ll f«l,tr ,wfi/ J 
e. I hivp no inivor Q 



0:| 



REASON 

1. Wish to reduce ambiguity of what is desired.



234 



2. Identification of a model is appropriate to this level. The previous answer depended
primarily on the experience of the student.




viiRSiON 



VliRSIDN .1 



llth 



mo 


UOO 


Ml 


1111 


1110 


UOO 


3S 




12 


17 


0 


0 


35 


I 


S7 


52 


75 


0 


0 


0 


0 


0 


80 


33 


V) 


1} 


17 


11 


20 


0 




16 


9 


14 


0 



n yivjnivf i^ni iprCMl Kjht out ever I J" If 3' icrrcfl I htt » 

ipr»J over . 5. 1 s- ,er... r» b.a ..,t tho .cr^I^/.o"!/ 

A. Atttul 10 ft, )hi 5 ft llip li 2 5 
■cw than the J fi o(i«, so ilio S ft ^ 
lM|i ihoj}j bQ 2 (\ MID bick, 

,0 About IJ ft bwuji 5/3x1-1] 1/3 y 
C. Altfut n ft biciusf I O • 11 / 



All nil inn nnn V^)*'' ^rrfsJj UHlfht out over »r xy icw^ xbwu 



Abeyt IS fi b«eii4f It ihouU bi ibwi 3 
iwiM II far - 



Oijtwctir A • Froa approxlMte nuabers to nore ej(plicit 

« * . o 2 "O'* thM 9 feet, 

HeiOYal of Abbroviations: im ft. to f«t. 

Froo 5/3 X 8 • 13 1/J to 3/9 n S/15. 

Ohtncter.,D -.repoved. 




17 


,„,5 


I 


35 


52 


90 


97 


49 . 


JL 


5 




10 


1! 


0 




3 


6 


0 




J 



D. About U feet bcMusB ii »houlil be ib«>t ivlct 4j fat J 



E, I hiYc no insver 




1. I wished to increase the plausibility of the answer.
Version 1 was confusing students.

2. Abbreviations could cause confusion.

3. The comparison of the ratio was intended to wkb this easier and closer to thii level. " 
IsHln!"'"'^'' ' ^'""'^ inappropriately be attractlnj fomi 



I7R 



8th 



All 


1110 


21 


0 


10 


0 


52 


100 


10 


0 


7 


0 



llOO 



llth . , ■ 

All nil 1110 UDO All nil uio noo . J;;;*;;^^^^^^^^ ^^»t.iuw.i7Ui«uuf6i?pi*. 



92 



0 



0 



0 



0 



26 



A. Ibi because 6 * I i u 

10 

1)( * a i s!i 

B. 3 or < lbs btciuie It ii oori 



J} (Cj l\ lbs bcuuso JU X U • 3>| 

_ 0 fit 3 or 4 Ibi Uimt It looks ihinwy f 
_0_ 8. 1 hivo no iniN(r ^ 




None - Performance was appropriate



I IF 



Bth 



llih 



QtANGES 



All 


1110 


noo 


All 


nil 


1110 


1100 


All 


nil 


1110 


A 

noo by 


24 


0 






0 


B 


20 


18 


!; 


fl 


42 ' 


10 


0 


0 


0 


0 


0 


4 




0 


0 


4S 


100 


2 
S! 


BO 


.100 


86 


80 




.OS 


ftp 


52 


10 


0 


0 


0 


0 


0 


7 




3 


3 




0 


0 

z 


4 


0 


6 


0 


S 




7 


1 



A, lloro ihn ktmc It ii still B^'vln; A 
11. Uii thin igj feci bocausc it ii o^ly 2 Kconds mt / 

9 



Di iOO feci kcjiMc 3, sec. ♦ 2 • 5 
!>• 1 \m III' iiiiiiwi'r 



J 



None - Performance was considered appropriate. The item is a good discriminator.



236 




■■■■■:■;!■ 



I OF. 



VIEHSION H 

Grade 0 

All 1110, llQOi All mi 1110 1100 



45 


100 


28 


0 


21 


0 


3 


0 


J 


0 





96 


92 


20 


I 


0 


0 


0 


9 


4 


6 


80 


0 


0 


0 


0 


0 


0 


0 


0 



CHAKCES 



I mi 

Grade 8 Gw<ie H 

All 1110 1100 Jl nil 1110 1100 



0 S 



0 0 



0 19 



0 3 



20 



k M lUpIsnnuiQE nzit tm tho psttcm icitures 
pij £8. Ions. niit'Muld lie the Icntth of swK i viK 
tm I pitttn with iquatt) thtt in.} tlflof ii lonj 
3 tiMi O ^idif 

A. Abeyl Jsfl. lecmwit iMki that vir ^ 



0. 9'i n. but tho i^uiroi would bo lir|ftr / 
B, ! )tm to lAtvor ^ 



VliRSION 3 



Ji« usci 2 hffliilnj \\\\'^m of Twb H^t vlih B oi, (Ini of wtw. *«N T»{ l''' - 
AU nil lUO 1100 nccJcilforlhPMio«UturcviDi3?ou Bfwaicr? ^ 



49 


6S 


57 


M 


17 






45 


20 


15 




3 


5 








10 









is Atout 21 tlllpPOlU bOMUll 37 Ol n 
*\ Ol ind 2 up. « 19 ttp^ > 21 tip. Q 



B, I hivi no utiver 



/ 

0 



The item seems to be working appropriately.





"1 
















1 









A K^ol alrflano wiii;3iij frerlhO-Lcft, pattciti >h«n 
All UU UlO 1100 B^Durct 7 (0, lonj. kliat vo-Jld be i)ie.le!:;t:i of titcH 
1 Klnc Biie froij a piacrti wiJb Knorci' thr: aio 6 a,t 



nTTTU 



27 



22 



25 



20 



20 



10 



26 



32 



^0 57 «. bcciuso i/2 X 18 ■ S7 ^'^ 
it, II at, bouuse it loo)ii thii^ui]! J 

C. 'i7 CM, becjwc 19 ♦ 3 ■ 22 im 

D. 19 cB. but tht :iquir«t Wldbo Ursn / 
e, I hivi no MMtr 0 




1. Stem was written with measurements rather than the multiple.

Version II - "...squares that are three times as long" and Version III - "...the 2 cm. pattern..."

2. Answers and distractors essentially the same but more integral values.



REASON



1. The formal reasoner should infer the multiple rather than just identify it.

2. Students asked questions about answer. It was intended to make this question more
discriminating.



2F, 



Grade B 



Grade 11 



CHANGES 
. NCMIB 



All 


1110 


1100 


All 


nil 


1110 


UOO 


10 


0 




I 


0 


0 


20 


10 


0 


2 

\ 


I 


0 


0 


0 


3S 


67 


83 


100 


92 


40 


31 


33 


1 


0 


0 


0 


10 


0 


0 

'A 


8 


0 


8 


40 





fl 


5 


A 


17 


0' 


3 


23 


43 


100 


87 


10 . 


22 


0 


3 


55 


5 






6 



Horn in y\M\ W «^ in ilri>Unp. Slctch »1 il ) \^\^\\ <iM or 3 rfr.lcJ Mr^.. JU•;c^ 



All nil lUO UOO \H air|il,iin; i> rot shflin, Slotch rj leoU the J*ac but is 1? ptll Hw 



t, About 7 bc»uic it hs} to br loro 
About S beciunc 3 it about $ 



About I \>mm 12 Is S tu» thu 7 
111 5 Mro thu \ 



C. I hire no antktr 



RliASON 

Appears to discriminate well.




Mi 




Glide 



3 



Grade II 



All 


1111 


1110 


All 


nil 


lUO 


10 




33 


24 


4 


36 


iL 


1 

Q 

S 


0 


•t 


4 


3 


38 


0 


3 


0 


6 


31 


VI 

S! 


67 


57 


87 


44 


10 


0 

z 




9 


4 


6 



aiANCES 
None 



i7ivi5i=; 171% 



Crde B 



Grade 11 



A ISO pouid «n itJnainn m on iho wd of i dlvlni.bond bmdi tlia end of tlio bonrJ Jowj 
All 1111 1110 ll All 1111 lllO Hnchc*. Ilo irtd hU 200 |wwd coopaiiion (totaUf 350 iwjnJj) bend it 21 M^^^ 
" viil tin board bend i*llh only the 20D pwid jWHonl 



69 


t 


::qo 


48 


35 


50 


JL 


C 


0 


12 


9 


19 ' 




i 

Hi 


0 


I 


0 


3 


10 


0 


35 


57 


22 


3 


0 


0 




0 


6 



A. II helm beeimi IN 9 > ^ 
I. U inchii bieiu)« 9 x300< 12 3 



d. lJliicheibcciu»iVlilnbiWew2Ud9 ^ ^^^^ 

0UlJichiib8ttiui 9 • Jl« 12 il 
IS 3SJ r 

c 



piANCES 

Replace the item.



IIF./IOF; IIB 



Crade B 



Grade II 



■All nil 1110 twi nil iiio {iJJI,!^'^*"*'''"**^'""""**^"'^^^''*^''' 

gKWdi Peel 
0 , 0 

a u 

S 2211 



1}L 

'U 


i, 
i 

0 

z 


33 


31 


17 


39 


0 


5 


0 


6 ■ 


17 


67 


31 


26 


33 




0 


31 


57 


19 




0 


3 


0 


3 




39 



Ho>^ lonii i\ uU the eir to tmel «0 feet? 



A. About a leconds becwse 220 ft. S xci. 

8S ft. 2 }iei. 
8S ft. 2 teci* 

, • 357 ft. • 51eM, 

I. About 9 leeondi beciioe It should bo Doie 
C. About 9mon(iibicigii»x9*S9S 

0 About 9 iieotidi beciuit 2 or S it ibout 9 
2i 1 hivi no mmr 



/ 

J 

i 

0 



1 nil HID 



20 


2S 


20 


11 


0 


3 


13 


0 


17 • ' 


34 


75 


50 


ill 


0 


10 



VliRSlDN I 



On ilie rvif illiiitraieil the cart ind Itt woi^bt ii bilvieid b]r 
v)i|)tti on Iho iiritijii Ithai v»uflt of velcht is nctdCil te 
biloncc 400 c of can wci(lit at 20" T 



A. U3 bc»u»e 100 x 1110 » 1S3 

&. ISO bociltve it il voro 

C, 177 beciu» It joei up 17 for %\%ri 100 



,0 



\\\ become. iQ9« 13) 
lit I Uvo no insMtY 



3 

/ 

X. 
0 



tm _ 

A«tl< uri ,\uiM 
Itr W| $: 

2o; % lea 

20^ 40C( t 



REASON 

The item appears to be working appropriately.



I5R 



A fmiij driver kecpi ma tf tfio Jliu«cc b> iravdj. lb rindi (hit tn » ilinei k 



la 


30 


17 


13 


5 


.■J 


2S 


0 


20 


3B 


65 


50 1 


5 




10 



A. K)m 13 kinutei boeause U 10 » IH 
I I 

I. Abouc 11 Dinutei beciuie 10 - 7lt • 2^ lilri 
I 

C. About M bociuio' 7li M • 10^ 
10M« U 



!l2i !■ 

< kit. ftV 



rEuei 4 kit, 
J l»l illej \i Bin, 

10 kllAi 

/ 

a' V 



E. I have no answer



REASON 

The item does not
itti(liiircd response. 

lOF. 



All 


1111 


1110 . 


llo 


20 


0 


3 




6 


0 


3 • 


62 


100 


90 ' 


7 




5 


5 







discriminate appropriately - too easy but appears to attract an



Here is a recipe for 4 cups of cocoa: Heat to near boiling 4 c. milk
Add with stirring 6 T. sugar
3 T. cocoa



A. 18 tablespoons because 6 + 12 = 18

B. More than 6 tablespoons because there is more cocoa

C. 18 tablespoons because 6 T. sugar / 4 c. cocoa = 18 T. sugar / 12 c. cocoa

D. 14 tablespoons because 4 c. + 8 c. = 12 c.

so 6 T. + 8 T. = 14 T.

E. I have no answer



J 

i 

z 

0 



Km ' 

The item does not appropriately discriminate.



SG 



VERSION I

Grade 8 ^^^^^ 
All .1111 1110 All 1111 1110 
IK 





47 


85 


39 


67 


40 


17 


50 


0 


4 


0 


6 


0 


4 


0 




0 


4 


0 


5 




ISii% 



.1 




Ml un 1110 '.Hci: '^i" ^ " • 

0. ti. Ihlel IhtKiK If " J. / 



21 


75 


7 


49 


5 


63 




20 


17 


9 


0 


13 


0 


0 


0 



T T 



1 



E, \ )iivo no itiiKct • " 



■ri: 

it 

I 




Grade 8



2. 

Grade 11 



nil luo Ail nil 1110 



33 



67 



0 



27 



1 



•0 



0 



6 



SO 



CHANGES 
None 



If 11 MtiWd It thiJ w|lfi? ^ 

A. About 7 b«iuM 13 1> JWiter thin 7 

- '- ■ H " 

D. Ab«it7crlbi««iiJ2»l0.7i3 P'^^^w A 



The 
eroii 



CHANGES

Drop the item and replace it.



IF, 






IS 



9 0 0 




REASON

The item discriminates appropriately.



5E 



Trill 1 . TVmcljhtu oi» siJc "A" Ulwcc W^K of ihc veU^J «. ti^» T ^ 
All. HI J lUO TrlMI)' rjuvv#t>w$ldc^V^ $hon;lJ4"r h«, 



34 


95 


25 


26 


0 


13 


IS 


s 


47 


6 


0 


3 



.0 



is About 7 bvciuio 4 4 I 0 S 
6f IV? 



E. I have no answer



f 
I 

X 
0 




A a " 

ft » 



.11 

■-is 



All 
0 


nil 

i 

I 

(/} 

0 


nio 

0 


All 
S 


0 


1110 
8 


All 
8 


nn 

0 


1110 »o 
3 " 


14 


33 


41 


74 


36 


21 


so 


15 


34__ 


67 


32 


17 


39 


25 


5 


40 


38.. 
14 


0 


12 


9 


a 


31 


45 


50 


0, 




•0 


6 


IS 


0 


13 



REASON

The content appears too complex - it may be adding confusion.



A Tint ' '""^hP^ nt'^c^^ \\ : feet frcn the It'H »ij < fcpt thi tiMf, 5" rlsi U> . V 

A. We Mwilcn kill bi' Urj[(r iltart rt?mcr tin iln; / J, 

-T A.. 



40 C About J ft. {ro« tliB hipVccmo 2 1 < ■ 2J I /. 

■ ■• ^ H '-^ 



lirtpr J » 1 1 4 ;ft, ♦ I ft, • J ft* / * V Vw/ 

^ A- — 

C. I liBVo no inivet ^ 



CHANGES
None



The item appears to discriminate appropriately, although it is difficult.



I 



VERSION I



I7C 



20 


19 


23 


3 


17 


6 


44 


56 




17 


6 




f 

11 


2 





I6C. 



Soil)' 



loa 



All 1000 1100 nil 



Here are some recipes for Kool Aid:

























u 


Viter 


ux 





75 , 


90 


97 


100 


10 


6 


5 


0 


8 


4 


0 


0 


2 


0 


0 


0 


5 


0 


0 


0 



How much powder is needed for 5 quarts of Kool Aid?



h 3 pk( becalise t qti M * S qti 
ifld 3 ptit O pk£ • I pi| 

C. About S bcciuiD it vould havQ to bo Mn 

HpliC boeiuii It ii ttit two Rlxturo 

E. I have no answer



/ 



All tOOO 1100 ten ii > Uxinf t( itne notric inJ Encllsh ■iiiurt: 



8 


2 


3 


54 


81 


61 


11 


0 


6 


IS 


6 


26 


12 


10 


3 



I5C. 



A, About !t because it bas to bo kdib 



:0 



About 20 bociu'iC 

30 ci 1 10 C4 MO ca • SO c< 
'ind 12 iA H i)) * 4 in > 20 In 



C. AbMt 19 becatise It tens tttkl way 

p. About 32 bccaiiie 

SO c« ^ 20 at " SO C* 
ifld 12 la f 20 iA ■ 32 in 

E. I have no answer



/ 



4 inches = 10.2 cm

12 inches = 30.6 cm

= 50 cm



8C. 

All lOOO 1100 nil ,iiMntlMolBVl»lpnimcnh»!O^M«liMofictw, UI inchietilkCuW^^^ — 



28 


25 


32 


30 


-0 


About J-IO Inchci bccmSf ^ 
12 X 12 • H< 21 X :i - 441 {J titici M iuch) 

A 

HiQ uxe but Mltb tqusrct 


3 


4 


0 


0 


B, 


3 


4 


0 


0 


C» 


Ui» than W 11], Intlm bceaujf ilio iquJrtJ "« lujcr / 


60 


65 


65 


70 


0. 


Horc than 10 iq, luehoi bccnuic it i» largtr J 


6 


2 


3 


0 


Si 


I hftVO RC »wot 0 




Ml lOOQ 1100 



52 



23 



56 



16 



Sue always drives home on the freeway. Her speed is different each day. Monday her
slowest, Tuesday her next slowest, Wednesday her next slowest, Thursday her next slowest,
and Friday next slowest. Friday it takes the least time to get home, Thursday the
next, Wednesday the next and so on. On which day does it take the second least
time and is it the second most slow?

A. Thursday & Tuesday because they are second from each end of the week

B. Thursday because



wst ipcod 1 Pri. 
■oit tiw Fri. 



C. Wednesday because it is the middle

0l(ft Ofli t>f btciuio 
ton tiM 



Pri. 




»oft ipo(d lion. Tuci. 



tWed. 
I^cd. 



Kcd. 



^ Tues. 
lues. 



^ Tucs, 

TUQU. 



3 fiDi, 



3 



rti, 




VERSION 3



VERSION 4



I4C' 



lie 



VERSION III 
All OOOO 1000 



CHANGES
None 



VERSION IV 
All 0000 1000 1100



26 



Mary buys 3 tickets to a raffle where 90 tickets are sold. Jane buys 1 ticket to a raffle
where 30 tickets are sold. Sue buys 3 tickets to a raffle where 300 tickets are sold.

Which girls have about the same chance of winning?

A. Jane and Mary because theirs are the least tickets sold

B. Sue and Mary because each have 3 tickets

C. All girls have the same chance
D. Jane and Mary because 3 chances in 90 is the same as 1 in 30

E. I have no answer



REASON 



All 0000 1000 



■il 


9 


90 


3 


0 


10 


: 1 


53 


0 


10 


27 


0 


1 


9 


0 



MM 'M: « t'Aiu^i red cf 30 r|.|i vill. If pict^i.i'J St m mm^ itit.^rt-als Iwl; lilc 



0 



/ 



' ^j^-i ('■■' ....fj .. .1 

■c,"V'.>'>-"| 



Of;. 



B. None of these because it is moving

C. II because it changes

D. III because it is increasing its distance

Reduce to only two illustrations.



All 0000 lOOO 1100 AcirMvinMtieenitinteMof 30r?hrill, irplcwrfaficf.MNxi{«t^^^ 



71 


29 


69 


100 


5 


12 


15 


0 


10 


2!) 


0 


0 


5 


12 


0 


0 


fi 


18 


1 


0 


0 


0 




0 



01 becfuie.U tovei t^nl dliiiAcii iidi */ 



B. None of these because it is moving

C. II because it changes
D. III because it is increasing its distance
E. I have no answer



No response 
REASON 

Wish to concentrate on results
Wish to increase correct responses.



90 






All 


0000 


lOOO 


All 


0000 


1000 


1100 


iMClne that frojtinf: baJ hccii jprfaJ out \ inch on top af a laall 6" x V caW. Prfdict 
wh« thB ihltlflDsi be if titu sjw mm cf fr?siinc ^crc jnrfoJ rut over i K" » U" 


10 


Id 


10 


6 


12 


0 


11 


A. More than ¼ inch because it covers less cake


0 


0 


0 


fl 




fl 




B. Less than ¼ inch because it looks that way


69 


ss 


70 


69 


35 




72 


C. Less than ¼ inch because it covers more cake


14 


IB 


10 


8* 


18 


a 




D. More than ¼ inch because there is more cake


7 


9 


10 


6 


6 


0 


i\ 


E. I have no answer 0



Ndrq 



REASON



246 



AH «IA>HMaiaai 




Immm 



VERSION 3



4C| 





0 


U 


10 


27 


0 


7 


U 


0 


72 


27 


iOfj ( 


10 


27 


0 



A» awl Ij •! ^ 5 - /I 

s 

I. XU'ii 5 UciV-'i ihfl •T" red li tliil lone / 



Alw: 4 bot»j» tho "t" rod li iliofter 















■ 0 0 c 




^■n 1 T * 


rr 

< 7 " 




AnsKCT T rewritten without including the proportion. 





i: 




{\ 


14 


55 


0 


0 




la 


0 


6 






l)J 


— ( 


\j 


0 


8 


17 



B, About S bcciue ihi T it thit \wt 

C, Abcu; e becju»0 ih« T tod «m * 
0 K^m i Vfwuii thi ^ w4 li jJtfiUt 

£, 1 hQVQ HQ ipivtr 



3 

I 





^^^^ : 

The problem in the original version suggests thinking inappropriate
for this level,
to make this more discriminating.



2C 



VERSION III 
All 0000 1000 1100

94 



4 



11 



59 



10 



94 



0 



0 



VERSION IV 
All 0000 1000 1100



66 



41 



69 



lon&ihil 



0 »lor« tlun 5 ptncUi Decawi It li bi^er ihu i •t^«n« 
II, ' uii thin 5 pmtllJ Icciuja It leiw tlm 

C, Aboyt i peticlli beciwe It km 4 t(Xtbooh 

D. S p«:lli btetiue tKrt li whit tht itudent d«ik ie«urfd 



i 
/ 

6 



CHANGE
None 



REASON 

Appeared to discriminate appropriately.



IC 



VERSION III 



VERSION IV 






CHANGE 



Ma fill 



All 


0000 


1000 


1100 


All 


0000 


1000 


IIQO \[ 


11 


16 


0 


6 


6 


• 6 


fl 


6 


16 


21 


25. 


0 


14 


S5 


B 


11 


8 


10 


B 


6 


19 


55 


15 


17 


64 


42 


65 


87 


55 


6 


()f) 


61 


2 


3 


2 


0 


5 


18 


0 


6 



A. Be smaller because the light rays change

B. Be larger because it is different

C. Be the same because the ring is in the same place



D. Be smaller because the ring is smaller



REASON 

Appears to discriminate appropriately.





4 



I4C, 



lie 



VERSION III 



CHANGE 
None



All


1000 


1100 


1110 


All 


1000 


1100 


1110 ^ 


61 


67 


55 


ay 


77 


62 


94 


93 


10 


10 


■0 


3 


B 


8 


0 


0 


25 


20 


45 


10 


13 


23 


66 


7 


1 


0 


0 


t 

0 


3 


8 


() 


0 


2 


2 


0 


0 


0 


0 


0 


0 



VERSION IV Thoio nituro hunt crotipi ir( thofdi for t mture hike, Mrt. h!i:Ttt * s tf*;;c?' 

Hf, UntnUHn. rilk-l it'.irr 

The teacher with the least students to help is:
0 'Hr. H^lt bGcimi I'll liTf^r than S is lirin thin | / 



Ci )lr* ticnton ( Wn. Fclk b«ciu(^ they hiv« ik rott iiu^.t.'.ti JS 
D. Mrs. Andrews because she has less students
E. I have no answer 0

REASON 

Seemingly appropriate discrimination.



All 1000 1100 1110



1 




90 








0 


0 






0 




If 


6 


10 




111 


5S 


0 


0 ' 



Four cars have different speeds. Car A is the fastest, Car B the next fastest, Car C the next
fastest, and Car D the next fastest. The fastest car takes the least time to go 200 miles,
the next fastest car the next least time and so on. Which car is the third fastest and takes



.0 



A ' 
1st fnstcst 

$ 

2st least title 



B. Car B



C. No car because they don't match up

A 

D. Or C ticcaiise: 1st turn fast 

i ■ 

1st nost tiN( , 



^ld fustcst 
T 

V 

rnit Irast tlu^ 



II 

2nil ml fust 
i 



3nl fftJtcJt 
3ril least ti«Q 



3t(l nost fa^t 
3rJ M^i tUi! ' 



Four cars have different speeds. Car A is the fastest, Car B the next fastest, Car C next
fastest, and Car D the next fastest. The fastest car takes the least time to go 200 miles,
All 1000 1100 1110 the next fastest car the next least time and so on. Which car is the third fastest and takes
the third least time to go 200 miles?



51 


15 


67 


93 


13 


31 


0 


0 


10 


23 


0 


0 


14 


8 


'22 


0 


12 


23 


11 


7 



.0 



Car C because:



111 (istist 
1st leist tlfii 



B. Car B

C. No car because they don't match up

D. Car C because:



lit QOlt fist 

lit tnt tin* 



2Qli fistiit 

CAXl 
U kilt (iM 



2nd M5t fis( 

caub 

2nd Mit tlct 



IH finest 

OAHC 
Jrd lii!t t:^ 



2rd M\ 

CAKC 
3rd Mlt tlr.i. 



3 

0 - 



E. I have no answer



Remove arrows and write out Car A etc.



Reduce ambiguity.



6C 



All 1000 1100 1110



35 


^$ 


:o 


,"0 




17 


6 


h/ 


12 


4 


16 


33 


n 




45 


10 


11 


S 


*6 


0 



CHANGE



O is Question 


"TO 



A'/ 
/V - 

A 10 ft. fliiS pfitc hii } sh.iilow 3? ft, Ions, A 10 ft. (trc has a jh/dow 7fi ft lor);. 
Ikv ions > yhaJjtf vlll i S ft, person hive? A / / \ 

0«.o« i; ft. K\',iiiM' n . ij ' ^ // ' 
Md:s-i3-^i; ' 



I. ,^hoiit 17 ft* \\ft^\M. it In hl|ir.pr Ih.n llir mn 

C, About 20 ft. because the pan is S ft. les^.«3 

n, Mwit IP fl, lif»aus«» It irn> tli.il mv / 

E. I have no answer



J 



IOC 



Here are some recipes for Kool Aid:









Kool Aid














fie 


Ic 


fat4r 


. 1^ 


^ qtS 



All 

li , 


1000 
I 


1100 
0 


UIO Il0 

7. 


16 


38 


22 


0 


9 


15 


0 


0 




31 


'/H 




3 


8 


0 


0 



A. \ p>( beuiDQ U Is t)i4 td/Lf ol't'jft 

t. 3 pig bficmse ms ♦ 1 ^t - 5 <its 
■ ipd J pk( ♦ 1 fl? • 3?'<t 

C. Abfiut 3 biciuse it vouid have to bi con 

UJ 2t) pit UckUlfl 4 Ittk « } (t^ * ^ '4'* 
^ ind2p^.r 'tpki'J'ipll 

E. I have no answer



• / 
3 

i 



Previous change was destructive. The question (6C2) negatively discriminates.



250 



5C 



VERSION 3



All 1000 1100 1110 



s 


10 


0 






:o 


lOO 


•5 ( 


i: 




0 


0 


6 




0 




6 


> 


0 


10 



Tti>l I J \^\\^ ilJf "A" baliticv ^ or the \M liii people 9n ilJc "5" 
Ttlal ni 6 jHVj'le on »lJc "A" jlioulil balance hci." n.my on iIJp 'T"! 
A, AiiOJt 10 VcMuift 2 wifo en "A" siiDuld bjliiiiev t« rorcj f L 

B. About 12 because it goes up 4 and 8 + 4 = 12

C. About 10 because it takes 2 more and 8 + 2 = 10

D. About 12 because it should be more
E. I have no answer



Distractor "D" changed from 12 to 11.




All 1000 1100 UlO Trill II :4p«pJ«w«ldi"AniIincB|ofthMtt.»ii;Pi?i«;;j^ 
IVlil III '( peoplo on lido ^A" ihyjlii bilinc« "t"? 



9 


a 


11 


0 


68 


69 


78 


93 


8 


8 


6 


0 . 


5 


8 


0 


0 


9 


0 


6 


7 



A. Aiiout 10 btcius9 3 on "A" ihotild biluM tw) oore J ' 
PUT 

B. About 12 because it goes up 4 and 8 + 4 = 12

C. About 10 because it takes 2 more and 8 + 2 = 10

D. About 11 because it should be more
E. I have no answer



REASON 

Wished to have "D" be a more plausible guess.




3C 



VERSION III 



VERSION IV 



All 


1000 


1100 


1110 


All 


1000 


1100


1110 


Ihli porsDO lUiUnt dovn « hilt loolit it her vitch 
Eich leeond sh< put) 1 itick in the im. t^.it cast 
Ukol/ Muld bo th« piiten ol th»e uicki? 


10 


15 


6 


0 


6 


15 


6 


0 


A. I because she moves each second




12 


17 


0 


10 


17 


23 


6 


21 


B. II because it's a steep hill


/ 


12 


56 


3 


3 


13 


15 


0 


7 


C. I or II because she is moving


J 


60 


8 


64 


B7 


60 


38 


89 


64 


^ I b«ciuiB ^cr ipccd is c!)in{lRt 


f 
0 


f) 


4 


0 


0 


4 


8 


Q 


7 


E. I have no answer




CHANGE



IC 



VERSION III



VERSION IV



REASON

The problem appears easy yet it does discriminate. When
the results for Grade 5 students (non-masters) are examined
it appears to be an appropriate question.



All 


1000 


1100 


1110 


All 


1000 


1100 


1110 


A 

on 


21 


15 


26 


0 


22 


30 


17 


21 




69 


67 


74 


1)7 


47 


23 


56 


71 


2 


0 


0 


0 


8 


0 


11 


0 


7 


13 


D 


3 


19 


46 


17 


7 


2 


6 


0 


0 


4 


0 


0 


0 



A. B«cwfl Utitx bBuuif the ihidw tprcids out 

0 IteeoM v.i\\ix beeiuK tht lichi ri/i dir.'i 
spreid II tuch 

C Stiy cti« lUfl becttttt it's tht rln| 

D. itttm^ \\ritx boctute th« bulb Ij fuhet vtiy 

I I I hive no iniver 







None 



REASON

Appears nearly too easy yet does discriminate. Scores of Grade 3
(non-masters) are lower.



252 



• 



VERSION 3



VERSION 4



VERSION III



VERSION IV



I8F 



All 


noo 


1110 


nil 


All 


noo 


1110 


nil 


Dii 


17 




1 


5 


16 




7 


0 




52 


49 




90 


S8 


50 


93 


100 


13 


10 


' 0 


5 


12 


6 


) 


0 


11 


3 


() 


0 


0 


0 


0 


0 


6 


3 


0 


0 


6 


0 


0 


0 



None 



A, Ahgui II flit, Thi $ foot luti ii i wri thin the 3 fitt cm ft' 11 ft« ii 
2 ion ihw 9 fiet ' * 

B. About 15 feet because 3/9 = 5/15

C. About 12 feet because 9 + 3 = 12

D. About 18 feet because it should be twice as far

E. I have no answer




REASON 



The item has reasonable overall difficulty and discriminates well.



17F 



I 



All noo 1110 lin . JjJJ^Jj «UHi^^^^^^^ WiatwinMapplrswpii;hlf6i).i.kj 





J 


5 




15 


'% 


0 


S 




71 


95 


90 ( 




0 


0 


0 


5 


0 


5 


0 



^ A. 9'| Ibi because 6 * S « U 
so 

Ilj • J . pij 



D. J or 4 lbs be:3«Jc it loah thm nr I 
E. I have no answer




CHANGE



"D" from a guess question to an addition type answer.

IIF 



I 



i: 




0 


0 


IS- 


n 




53 








^- 




;s 




0 


s 


0 




p 









at the end of 5 seconds?

A. More than 198 feet because it is still moving
B. Less than 400 feet because it is only 2 seconds more



C. 330 feet because 198/3 x 5 = 330

D. 200 feet because 3 sec. + 2 sec. = 5 sec.

198 ft. + 2 ft. = 200 ft.



E. I have no answer



Students did not select this distractor. The problem appeared to be too easy.



jino li vel^lni out ippld on this ^uptntrkt scilt, >^»t v]ll K ipplci utith If t;*^}* 
All 1100 1110 nU vel(hl>}lbir 

A, S<( Ibl beuuM 6 * I - U }L 



9 


5 


0 


0 


23 


44' 


14 


0 


34 


28 


71 


67 


27 


22 


14 


33 


^ 


0 


0 


0 



H. 3 er 4 Ibi beciuio it is torg ^ 

0 l\ lb» bcciuio li X 1< Oil ^ 

D' Jl|bccitJm!j4 Ifi'Jj.JI, ' / 

Et I hivfl Rd mver 0 



REASON 



This gives a clear distractor for a Level 2 reasoner.

The question previously came across too easy probably because it lacked this type of distractor.



How far will it have traveled


All 


1)00 


11)0 


111) 


A( 
b, 


z 




42 


0 








la 


5 




1 


4 


0 


0 ■ 


0 


f 


fi5 


r»2 


DO 




3 


7 


3 


3 


0 


0 


5 


3 


7 


0 



A. More than 198 feet because it is still moving



D. 200 feet because 3 sec. + 2 sec. = 5 sec.

198 ft. + 2 ft. = 200 ft.

E. I have no answer



J- 



CHANGE



"B" from 198 feet to 400 feet to make it a plausible answer.



254 



VERSION 4



I OF. 



8F, 



All 1100 1110 1111





:6 


:o 


:5 


IS 




40 






5: 


10 


:o 


Ic 


10 


1: 


10 • 


r 




15 


40 



A lirplMf vlttf tajf frpi t),c j ^ ' ; \ L . 
warned 7 cn. Iw;. fckii wrtiU b^. V' h \L 
' ! :rr^ a pjitcrn with \^ 

0 $7 ca. because 6/? x 10 • $j 

B. IK CO. tefc it )ool$ th,^j / 

C. 22 CB. kc5U56 19 ♦ 3 » :^ / 
Pi 13 cc. but tlip s^iuroi invilj J^' 1^^^'/'" / 



All 


noD 


1110 


nil 


All 


liOU 


1110 


nil 


Jim uses 2 heaping teaspoons of Tang powder with an 8 oz.
needed for the same mixture with 27 oz. of water?




52 


57 


85 


48 


22 


93 


67 


A. About 7 teaspoons because 27/8 x 2 tsp. = 6 3/4 tsp.


17 


4f 




0 


23 


33 


0 


33 


8 

K. About 21 tcDpOons becsuse 27 oz 


















-8 2 tip. 4 


20 


3 


43 


15 


22 


39 


7 


0 


C. More than 2 teaspoons because there is more water


5 


0 


0 


0 


4 


0 


0 


0 


D. 2 teaspoons because it is the same mixture


10 


0 


0 


0 


1 


6 


0 


0 


E. I have no answer



CHANGE

None 




«ing length ]|_cx changej to 19 cm. 



/ 
0 



REASON. 



Die Item has dlscrlnination. It appears too hard but Bore me 
was dcsirtd. 



All 


1100 


1110 


nil 


A model airplane wing made from the 2 cm. pattern shown
measures 19 cm. long. What would be the length of
a wing made from a pattern with squares that are 6 cm.?


37 . 


33 


57 


33 


A. 57 cm. because 6/2 x 19 = 57


5 ■ 


0 


7 


0 


)• 11 CI. bcciusi It thn MX i 


14 




7 


0 


C. 22 cm. because 19 + 3 = 22


26 


V) 


29 


67 


D. 19 cm. but the squares would be larger


16 


17 


0 


0 


E. I have no answer




REASON



This was an error in the stem. The problem comes off as too hard.



2F, 



CHANGE 
None 



13 


6 


6 


0 


8 


6 


0 


D 


17 ' 


23 


3' 


0 


9- 


'17" 


0 


'o" 


43 


10 


87 


100 


35 


2B 


79 


100 


22 . 


S5 


3 


D 


23 


33 


14 


0 


5 


6 


0 


0 


23 


17 


7 


0 ■ 



Hen li ilctch »| of in ilrpjine. S^«i:li «1 U 7 punciJ wliihs : \m\t'. rX'\. ^^t•t^ \\ 

/ 



REASON 

Appears to be a super discriminator.




K Sefiu to bo 6 

B; Abfrut 7 betiusi Ithii to be tin 
0 About 5 beciuse I Is jbout J ^ 

D, About « beciuse 12 Is S rwe thin 7 J 
' w4 8 li 5 Bon c)iu 3 

E. I have no answer



256:- 





t 



i 



3 



VERSION 4



All 1110 1111
25 



25 



34 



11 10 



50 



CHANGE 
None 



0 



On Ow ra»p illuUMted thf cart vdjht 1) b;Jinff J bv • 
All 1 1 in 1111 «U'>^> ^" '^T'^^ >^mt of ii nefJrJ te ' h\t\\ 

All lUl) IIU bilmH05pfMrueIjhui20°? Anjlc cIFFii^ 

K, 135 beeiuie IQO x ^00 ■ 135 



22 



29 



0 



0 



C. 177 because it goes up 17 for every 100

0 133 bc:a; 



icfouie ICO ■ 155 
E. I have no answer



REASON 



i 

/ 




The item appears to be discriminating appropriately.



I5F, 



A freeway driver keeps track of the distance he travels. He finds that in 4 minutes he
All 1110 1111 travels 3 miles; in 10 minutes 7½ miles. If he continues at this rate how long will it
take him to travel 10 miles?

Distance Time
3 miles 4 min

7½ miles 10 min
10 miles ? min



21 



.0 



About 13 minutes because

4 min./3 miles = 10 min./7½ miles = 13 1/3 min./10 miles



I.- 13 cinutes beciuif 10 - ■ 2i] tiles 

C. M 13 pitnjte] beciuif 4 j jo ■ 15 1/3 

D. ^ut 14 bcc4u»i: 71] O • Id 

Bid 10^» 4 « 14 

E. I have no answer
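
The constant-rate reasoning this item is meant to elicit can be sketched in a few lines of Python. This is only an illustration, assuming the readings recoverable from the item (3 miles in 4 minutes, 7½ miles in 10 minutes, a 10 mile target); the names `observations`, `minutes_per_mile` and `time_for` are hypothetical and do not come from the dissertation.

```python
from fractions import Fraction

# (minutes, miles) pairs as read from the item; these values are assumptions
observations = [(Fraction(4), Fraction(3)), (Fraction(10), Fraction(15, 2))]

def minutes_per_mile(obs):
    """Return the common minutes-per-mile rate, or raise if the rate varies."""
    rates = {minutes / miles for minutes, miles in obs}
    if len(rates) != 1:
        raise ValueError("observations do not share one constant rate")
    return rates.pop()

def time_for(miles, obs):
    """Scale the constant rate to a new distance (proportional strategy)."""
    return minutes_per_mile(obs) * miles

print(time_for(Fraction(10), observations))  # 40/3, i.e. 13 1/3 minutes
```

Exact fractions are used so that 4/3 and 10/7½ compare equal without rounding.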



REASON



Allows the student to view the correct answer sooner.



!0F, 



3 

0 



t frcpW)' Jvlvcr Irrp track of the Jlstsiifp travels \\t thii t^t ii 4 ii\T,\,\{t u 
All 1110 1111 tw»'i'l* ^'j «11"' If he cMtiiia:» ii Uiy h,^ Icr; t(ili :i 



18 


17 


30 




5 


■ S 


25 


20 


0 


38 


. 50 


65 


5 


■ t 
10 


Q 



take him to travel 10 miles?

A. AlinK 13 nirutcs bcca\i5i* 4 1 10 « 13 1 



B. About 15 oinutes bcciuse 10 - 7l] ■ 2!i liles 

10Ui}> 121) Bin. 

C. ^t 14 bfuusc 7<i * 3 - )0H 

... 10»4«W 



10 6ilc* ♦ m, 

/ 



65 _ [ D.) About 13 Kiniitcs beeau)? 4 njn. _ ■ IOein_>__ ■ 13 1'^ r|p. if 



I hivfl no utSKcT 



Switched order: A to C, D to A, and C to D.



All 1110 llll '"^'^ * * ^^^^ ^^^^^ ' ^"'^'"1! ^ ^' 

Add with stirring 6 T. sugar

How much sugar would be needed to make 12 cups of this cocoa?

A. 18 tablespoons because 6 + 12 = 18

B. More than 6 tablespoons because there is more cocoa

C. 18 tablespoons because 6 T. sugar / 4 c. cocoa = 18 T. sugar / 12 c. cocoa

D. 14 tablespoons because 4 c. + 8 c. = 12 c.
so 6 T. + 8 T. = 14 T.

E. I have no answer



5 



0 



Removed language from distractor "C".



All 1110 1111 Here is a recipe for 4 cups of cocoa: Heat to near boiling 4 c. milk

Add with stirring 6 T. sugar
3 T. cocoa

How many tablespoons of sugar would be needed to make 12 cups of this cocoa?



SO 



i: 


n 


0 


38 


36 


loo 


lb 


14 


0 


6 


0 


0 



(1 



A. 18 tablespoons because 6 + 12 = 18
B. More tablespoons because there is more cocoa



© 



I IT 



D. 14 tablespoons because 4 c. + 8 c. = 12 c.

so 6 T. + 8 T. = 14 T.

E. I have no answer
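
The contrast between the proportional answer and the additive distractor "D" in the options above can be made concrete with a short sketch. It assumes the recipe values read from the item (6 T. sugar for 4 cups, 12 cups wanted); the function names are hypothetical labels for the two strategies, not terms used in the dissertation.

```python
from fractions import Fraction

def proportional_answer(base_amount, base_cups, target_cups):
    """Scale the ingredient by the ratio of target cups to recipe cups."""
    return Fraction(base_amount) * Fraction(target_cups, base_cups)

def additive_answer(base_amount, base_cups, target_cups):
    """Additive strategy: add the cup difference to the ingredient amount."""
    return base_amount + (target_cups - base_cups)

sugar_tbsp, recipe_cups, wanted_cups = 6, 4, 12  # values assumed from the item

print(proportional_answer(sugar_tbsp, recipe_cups, wanted_cups))  # 18
print(additive_answer(sugar_tbsp, recipe_cups, wanted_cups))      # 14
```

The additive strategy reproduces the arithmetic of option D (4 c. + 8 c. = 12 c., so 6 T. + 8 T. = 14 T.), while the ratio 6/4 = 18/12 gives the keyed answer.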



X 

0 



mm 

The problem came across as too easy. It was suspected that the words with answer

"C" might have been a cause.



Li 





• 



m 



VERSION 4



All mo mi 



0 



CHANGE
None 



AU UIO 1111 lo^sinfl thit concrete hii been ziui to u\t i pxtlo 4 ft. x 4 ft. arJ l| i 'foot iV.cL Hm 
thick, hi U this c^ncitti be if it li imii i}:tii m cvir u 1 ft. x i (t* vri? 

57 0 Wt, thick ^fcwif 16 • if ^ 



19 



58 



7 



71 



J3 



K. 1 ft. thid teciuii 1 

T 4. r 



C. I ft. ihUk Uuw 1 ii Iiii t)iin ) 
T T } 

D. 1 ft. thick Ucittio It MJ be )fu 



/ 

0 



i 



E. I have no answer

REASON

This item exhibits good discrimination.



IF. 



All UIO an 



8 


3 


0 




U 


so 


:3 


40 


i 


u 




45 


IS 


n 


0 



CHANGE



A rlnj 3 Ir.O,^^ it.'ou Is 2'fcDt froB) the liiht ami 4 ftet tm thfi tMilc Dip T rluft hn^ 
1 5" thiiw, I'^ifo ill'. Jill ft 4" rinc le pluciii) lo mIj the sa-o M» ^liU'l'''-' 

A. 1h) >!>;^^' vlll ht |ir{cr \\M f khmvof tl 
it yWM, 

^ Abwt 1 ft. ffo^ \\a Irp bOHMJO I • ?jl 



C AlJtit 3 fti fron tl.9 becww H 4 • 2.7 



ft, Ib'nJt ] ft. fron ttt btcfm'i tin: li T JL 
lirctr S M « 4 it)<l 2ft. * i ft. O fti 

E. I have no answer



1^ "li" chMgcd Mi\\\ all proportions shomi, 

TOT 



Ml 11 
17 



34 



10 mi 



None 



Win 'TV«wci[h»0[Jsi5le''A''bilinMthiMOftht Jiitvflrttlflflildf T 
All lilO 1111 TriiWNFourweishttenililiT. Six oti tldv "f* 

TtlM WU hit «ijhtj M )lle '•A" then tMi biUnct hoK iwr V(l(hti on il^ Tf 



22 



43 



Ml 1110 nil 



REASON



33 



Ai About 8 beciuii | j , 

©Aboyit ( brctuti 6 I.S 
4"T 

C. About 7biciui0 4 M ■ S 
4 

T r 



J 

0 



0*^ 03 
A 6 



REASON 

The item appears to be appropriate.



12 


0 


0 


U 


14 




29 


43 


33 




l\ 


0 


1!) ■ 




33 



A Tint i M<i «rfM ii : Utx frca the U!<t «J 4 f»<t ff.*? t)if tiV.i, ?if J" Ui 

1 fill il.R.Uu . «il mI^. >L I • • 

J. 

,0 



A, t^Mha^W bf Urpil thw J" «hfwvif the tlnf / 
it plM. 

) Abwt J fi, frci thi Iwp bf(iut» J > J • Jjl 

Ab9ut S fi. fYi?D the Umu : 1 4 ■ 2,7 J i 

rtoiit 5 ft. frm the )4i«n bfCiuif tbf rln{ li <i2 JlJ' ^ jJ**^ * 
litter 3 O < 4 inJ ?ft, ♦ 1 ft. i J ft. W ' \^ • 

C, ) hwa ftiii^^t ^ : 



IWh tictP U iwrv iill'l'lv'iilt tlKin poMlbly bcc;uiJf a iW^m Jew the 6'? 
proportion and no place to apply it.

^60 1 



VERSION 4




VERSION 5



VERSION 6



I4C 



VERSION IV 
8th



VERSION V 
12th 



VERSION VI

8th



None 



All 


ooou 


lOOD 


1100 


All 

0 


0000 


lOOQ 


1100 


All 


OOOD 
0 


1000 
0 


1 100 viic 
Mil 

D 


/ 


15 






4 




or 


11 


2S 


10 


3 


10 


24 


6 




0 








8 


16 


6 


2 


15 


24 


0 




3 








10 


20 


IS 


3 


63 


24 


94 




91 






68 


29 


69 


90 


0 


0 


0 




I 








3 


9 


5 


2 



12th r_bis = .6177
Masters T = 12.9065



8th r_bis = 0.5980
T = 15.3799



D vlino PO lldctn iiro jrlJ Jmp hy 1 ilcUl l« a raffle 
!,ut liuys 5 IIcLpU to I raffle vhrrc 'A' \UUu «rv hM. 



cli lirvit ;ilibiit \k ^a»p rliuico of viiininsl 

A. Jane and Mary because theirs are the least tickets sold

B. Sue and Mary because each have 3 tickets

C. All girls have the same chance



i 



90 0 0""'' '111'' ^'Jry bpcBusp i dunces Jn ii iho >fttie as 1 in JO ^ 



E. I have no answer



REASON



The item appears to have appropriate discrimination.



3C 



VERSION IV
8th

0000 1000 1100



35 



0 



11 



lie 



VERSION V 
12th 

All 0000 1000 1100
0 I 



D 



D 



0 



12th r_bis = .4376
Masters T = 7.9959






VERSION IV 
8th



VERSION V 
12th 



VERSION VI 
8th



All 

0 


0000 
0 


1000 
8 


1100 

0 


All 
0 


0000 


1000 1100 

1 


All 

0 


OOOO 
0 


1000 
0 


HOD 


71 


29 


69 


lOO 


94 




/■ 


71 


29 


82 


82 ( 


5 


12 


15 


0 


4 




/l 




9 


26 


4 


6 


10 


29 


0 


0 


1 






8 


22 


1 


2 


5 


12 


0 


. ? 


0 








9 


13 


10 


6 


6 


IB 


7 


0 


0 








4 


9 


3 


3 



III] niponiA 

B. None of these because it is moving

C. II because it changes

D. III because it is increasing its distance

E. I have no answer



CHANGE 
None 



12th r_bis = .5465
Masters T = 10.7241



8th r_bis = 0.5577
T = 13.8522



The item discriminates well and has appropriate difficulty.



ln,it;ir Ibt \m\\iz h\ \m xprctn! m \ Inch llild (tn top of a m\) > C" trhr. hrtlkt 
vhui the tliliktic'i^ li'iiilJ bp li liiD I'm af frPMiii;; voro Vfrmd w\ w\ 9 )^ \ 12" 

A. More than ¼ inch because it covers less cake

B. Less than ¼ inch because it looks that way

C. Less than ¼ inch because it covers more cake

D. More than ¼ inch because there is more cake



E. I have no answer

CHANGE
Illustration added 



VERSION VI
8th

All 0000 1000 1100



I; , fit- ti ; ( ri(...(i:i;. I, . , i,jf„ , I , ,v,i n,t \ , I,,,, „r J, , 

t il 111.. Iliir'i.iiX II..,.! I k . f • II . . . < 



1 


I 


. „0 


2 f^'- 


5 


9 


5 


3 


6 


12 


4 


2 


72 


.^8 


80 


84 


12 


20 


8 


8 


4 


9 


1 


2 


8th r_bis = 0.5677



«1 r;i; I'll" i. i;' ? I* 



/ 



^ 11, tlisn I, jjit!) |.[-cft|iM' It iwU tl.r,t J L 



ll» I li.'i*' im iiiivi'i 



Ti 14.2)50 



/ 

/ 



264 



REASON 

Success on this problem for the CI level student should be possible
without abstractly viewing what the area change demands.



2C 



VERSION IV

8th



VERSION V

12th



VERSION VI

8th



All 


0000 


1000 


HOC 


AU 

(1 


0000 


1000 

. J 


IW 


All 
ft 


OflOO 
IS 


1000 


1 


4 


12 


0 


6 


3 






' i ' 


10 ' 


.1 


s 


14 


3S 


0 


Q 


5 






' 13 


21 


0 


J 


6 


IR 


() 












8 


I'J 


(i 


5 


62 


6 


02 


72 


91 








S!) 


25 


BO 


74 


13 


0 


6 


17 


0 






J..,, 


9 


J 


U 



an ' 



12(h ^ii* .5!)7S 
Hute» T> 12.2423 



ath n}ls* .S7il4 
T» U.6S67 



I;, IfrM 1 1'Vi.K'i' ;ir T r<^ lit tU( 
r. M^'iii dl'u -IT \\i "ft" thi vjis /i 
(I'M ^ \nmft thv "V* ivd'ji \Ui\tT 



:0 



J 
y 



.fey 



The item appears to discriminate appropriately.




racnv VERSION V 
Stl" I2lh 

[All » lO'jO U!i|]:.:;i 0000 1000 1100 

' i 1 ! oh I 



6f 1 11 I 



10 ' JLL 1 



TT 



5 I 



£5| 93 



t 



ot 0 

n — 



CHANGES r_bis = .5762

Masters T = 11.5831

Added ruler with integer values



IC 



traili'.i^'^ ill A it 4 it'xtuMk un;;!i^ k'iiie. *!iif It i uidtr'i M. miwM In |«Rcil 
fy Witt ihm fi pmcil: Uwt li in Hlch ihip ft itWm Jtrl/ t 



D. I hmw M'ii M \ht mim dcik mm^ 
C. I hi Ir.iusr 



i 



VERSION IV 
8th

All 0000 1000 1100 



ft Mi»ti'.H'!. ,lr.L 



3 


3 


10 


)i>i 

0 


64 


41 


65 


7^ ^ 


5 


11 


6 


2 


6 


9 


3 


8 


7 


17 


3 


2 


IS 


15 


14 


16 




REASON




5th hii 
T 



1^ 



'■1 A)irvi.lp,H,|)sh;.'i-i, h 
• ,4407 



Wished to BOH broadly t^tn tho proportion mswc:. 



0 



Uh 



VERSION V
12th 



j-Ml '.'a-J-T ICiM :!C0]A!1 OOfiO 



1100 



b \ ^: \>[ 3 




14 1 3S 


S' 11 


6 




<^ 




i9 


31 


■ U 17) 


0 








55 




61 ; 




y 


( 


S 


11 


0 6! 


0 









!('!''!■ iiil?;' " ^i^'' ^ u,f rin 

'■'■Hir 111.'.'.. I',; ii,-;,! u,,i|j 
^. '■n.ni..rK.i,H'|(|Mliff,Mi,i 

^ I'« tU Mli 111.. U'.IU^p U \t 

I'' U-Ilirkl'lnt 11,1, 

1 hirt ttb iQv.tr 



12lh 'bis* 
AJJed th« "thidy> cr I tullir riA|. 




VERSION IV
8th

All 0000 lOOO 1100 ■ ^rln^liLMJM', 





0 




l( 


10 






ljZZI 


lil 




3 


3 


u 


22 


Ji- 


_J 


57 


17 


70 




6 


11 


' 4 





UKt ni I |i»M lul,. liv ;|,M |.yit, ,,„|^ , ,) ,f 
.1 III fli" y.>:>i iMt( \k khj^v «f Die r rii;j m ** 



'i. fir )»i{tr twit H 1> JfffMii 



8th r_bis = 0.6458
T = 17.5747




This is more difficult than other items for the level. Wished to give a model for the
suggested change. Success on ring size at the CI level should not demand that the student



abstract what the change would look like.



VERSION IV 
2- 8th 



VERSION V 
12th 



VERSION VI 
8th



All 


1000 


1100 


1110 


AH 
0 


lODO 


1100 


1110 1 
0 


All 
2 


1000 
6 


1100 

0 


1110 
0 


77 


62 


94 


93 


95 




1 


93 


67 


54 


92 


94 


8 


B 


0 


0 







7 


10 


11 


2 




\l 


23 


66 


7 


1 


-1- 


0 


14 


IS 


6 




3 


8 


0 


0 


1 






0 






0 




Q 


0 


0 


.0 


1 _p 






0 




0 





CHANGES 
. None 



lie 



12th h\i «■ ,563$ 
Nastors T» 11.2036 



VERSION IV

8th



f Aj >Ir. Ilnll licMii^o 5 is lomcr lhan S Is U^jfir ihi>n 8 



|5t5. />m)rcws - S Mmlfnts 



1 



\ 



Bth ^is- .560 
T« 13,936: 



VERSION V 
12th 



D Mr, Demon f. HrJ. Fdk bccwjc 2 is larficr thin 1 is larcfr thoij 1 

C, Mr, li^iilMi t Hr5. Fclk brjwuc Uey havo ffo«t siudcntJ 

D. Mrs. Andrews because she has fewer students

E. I have no answer

REASON 



X 

I 

0 



The item matches well the appropriate difficulty for this
level and discriminates well.
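The item sheets report an r_bis and a T value for each item without showing how they were computed. As a rough illustration only, the sketch below computes the closely related point-biserial correlation between a right/wrong item and the total test score, together with the usual t statistic for a correlation. It is an assumption that this resembles the procedure behind the reported figures; the function names and the six-examinee data set are hypothetical.

```python
import math


def point_biserial(item_correct, total_scores):
    """Correlate a right/wrong item (1/0) with examinees' total test scores."""
    n = len(total_scores)
    if n != len(item_correct) or n < 3:
        raise ValueError("need two equal-length lists with at least 3 examinees")
    passed = [s for c, s in zip(item_correct, total_scores) if c]
    failed = [s for c, s in zip(item_correct, total_scores) if not c]
    if not passed or not failed:
        raise ValueError("item needs both correct and incorrect responses")
    mean_all = sum(total_scores) / n
    # Population standard deviation of the total scores.
    sd = math.sqrt(sum((s - mean_all) ** 2 for s in total_scores) / n)
    if sd == 0:
        raise ValueError("total scores must vary")
    p = len(passed) / n  # proportion answering the item correctly
    mean_pass = sum(passed) / len(passed)
    mean_fail = sum(failed) / len(failed)
    return (mean_pass - mean_fail) / sd * math.sqrt(p * (1 - p))


def t_statistic(r, n):
    """t value for testing a correlation r against zero (n - 2 degrees of freedom)."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Hypothetical data: three examinees pass the item, three fail it.
item = [1, 1, 1, 0, 0, 0]
totals = [5, 6, 7, 1, 2, 3]
r = point_biserial(item, totals)
t = t_statistic(r, len(totals))
print(round(r, 4), round(t, 4))
```

A high positive r means that examinees who answer the item correctly also tend to score high overall, which is the sense in which the notes say an item "discriminates well".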



VERSION VI 



All 


1000 


1100 


1110 


All 
0 


1000 


1100


1110 

■ 0 


All 
1 


1000
6 


1100 
D 


1110 fas 
the 
tho 


SI 


15 


67 


.. 

93 


86 






86 


5S 


51 


81 


73 ■ 


13 


31 


0 


0 


3 




i 


0 


\i ■ ■,, 


5 


I 


10 


23 


Q 


0 


4 




7 


U 


IS 


2 


3 


14 


B 


22 


0 


7 




7 


17 ' 


27 


11 


21 


12 


23 


11 


?! 0 






0 


6 


U 


2 


1 



No response
m A 



CAU 

111 iMlt.tiBD 

Ci flo m bcctus( thy W\ natc'i up 

p, Cir C keiuici 1st m\ fait 
CAR A 
]st Rosi Wn 



2DiI fD^tCM 

ind least ti&e 



h\ lost Tasi 
m\ tine 



Jul faJtMt 
3rd liasi tine 



CAR C
Iril n{iH tine 



/ 
3 



CHANGES
None

IOC 



12th r_bis = .5895   8th r_bis = 0.5451
Hftstora T« U,B009 T* 13, W 



The item has excellent discrimination and appropriate
difficulty.



VERSION IV



VERSION V 
12th



VERSION VI



All 


1000


1100


1110 


All 
0 


1000 


1100 


1110
0 


All 
0 


1000
fl 


1100

0 


1110
0 


5 


8 


0 


7 


0 






0 


2 


6 


0 


16 


38 


22 


0 


7 




I: 


7 


13 


25 


3 


1 


9 


15 


0 


0 


; 




0 


10 


20 


2 


0 


68 


31 


J8 


93 


91 


... 




93 


69 


41 


■94 


;97' 


3 


a 


0 


0 


0 






0 


fi 


8 


2 


I 



Aide 












he 




KnUr 




Mil 



HaitPTJ T« U.70U 



8th ^ll* 0,604S 
T' 1I.M3B 



H, \ fit lirrMK ll iJ llii >«w »l>'»if 

S fll Iww Mis ' I 111 • 5 qti 
ma ! fit • 1 fl( • ¥i: 

p, fiiwt 5 iKctwc il mM \m \t k pore 

0)1 flf, IlltlllSf ( Hi! I I It 'Ml' 
mill 1 |il( 1 1( • J"! H 

B. I have no answer
REASON 



/ 
i 

i 

0 






The item has good discrimination and a good difficulty level.



5C 



VERSION IV   VERSION V

8th   12th

All 1000 1100 1110   All 1000 1100 1110

I I 



s I n 



0 



0 



Dt! Si 



JILL 



0 



ID 



5 



Masters T = 9.2681



Hatiui wie JoW apparently pToportioniil 



.0 



VF.WV1 

AIJ 1000 1100 UlO iii'ii I >i''?;t'f I. 

lil't 11 4|i'v«. !■ 

■■) 111 

No response

(.(1 -I' 



4 


3 


T 


I1 

1 


14 


23 


11) 


5 


3S 


8 


ss 


63 


20 


54 


10 


13 


16 


23 


IS 


7 


U 


10 


3 


7 



T* Ili6l66 



.0 



II. Al'."; liloi,.;.:. i,, 

t., I Vm III) rui<i ) 

REASON



h; ..r )ii i: 



I 
0 



The item did not discriminate well between Level I and Level II.



3C. 



VERSION 
8th 



VERSION V



VERSION VI 
8th



'flit J |iorsM» <Jiwn !< Ml) loi'li pt U'V v'stch 



All 


1000 


6 


IS 


17 


23 


13 


15 


60 


38 


4 


8 


CHANGES



nil I jlvi:^*'!! M](l|l<[; V«'<(l| (I III 11 lUl'kn til IIM 

Ml lOOn nnfl inn Ml IDOO 1100 1110 iiacHsccoiiai)icriH5.iMiriintiiMnc«. mmw 
\ K UUU nil) All lyuy uuu iuu mp]y liouinii ilio pmcm of these uidiiT 

No response

A. I 




0 



2 



7 



22 



n 3 



6 



13 



2S 



7 



2 



2 



16 



3 



0 



1 



12 



1 



8th r_bis = .5037
T = 11.7257



[1. n btm it \% I steep hill 

C, 1 or II I'DCDUJc jW^llIqSvfni'" - 
^ 1 bociiujo her i^tct] li tlun£lnc 

D. I have no answer

REASON



The item works appropriately for 8th graders. [Illegible]
discrimination as expected for masters.





IC 



K\\ \f J^l^ 



VERSION V 
12th 



All 1000 1100 1110



0 




93 



VERSION VI
8th 

All 1000 1100 1110 



1 



53 



3 



3 



:'20 



0 



0 



1 



Masters T = 10.0244



8th r_bis = .4944
T = 11.7257



A ring is held between a table and a light bulb. The light casts a shadow of the ring
onto the table. If the ring is moved closer to the table, the shadow will:

A, HfCPPic Urpcr bctlwc the >|i.iilw JiTfiitl* m / P 

flirvMil M Mch 
C. Stay the same because it's the same ring
D. Become larger because the bulb is farther away



REASON

The item has good discrimination although it is harder than
many in the set.



I8F 



VERSION IV   VERSION V   VERSION VI

12th 8th 

All 1100 1110 1111   All 1100 1110 1111
A movie projector spreads its picture out over a 3' x 3' screen 9 feet away. To make
the picture spread over a 5' x 5' screen, how far back must the projector be moved?



All 


1100 


1110


1111


All 
0 


16 


^4 


1 


0 




58 


50 


93 


100 


84 


12 


6 


0 


0 




6 


0 


0 


0 




6 


0 


0 


0 






{GES I2th 



0 0 



19 



46 



II 



31 



26 



15 



IS 



0 



81 



No response

4 A, Aliout 1) Ik $ [w\ ii i mt Otu) IH 3 im m imj )l frri i» JL 

' ' " 2 (lore than 0 feci 



91 



0 



None 



17F 



Masters M2J015 



VERSION IV 
8th 



8th r_bis = .5863
T = 14.9215



B. About 15 feet because 3/9 = 5/15

C, AI«oiit \t feet Iirc(iii5r 9 « 3 • 17 

D. About 18 feet because it should be about twice as far
E. I have no answer

REASON

The item works appropriately for this level.
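The response giving 3/9 = 5/15 for this item can be written out as a direct proportion. The worked step below is a sketch that assumes the width of the projected picture grows linearly with the distance from the screen; the symbols w and d (picture width and projector distance) are introduced here for illustration and do not appear in the original item.

```latex
\frac{w_1}{d_1} = \frac{w_2}{d_2}
\quad\Longrightarrow\quad
\frac{3\ \text{ft}}{9\ \text{ft}} = \frac{5\ \text{ft}}{d_2}
\quad\Longrightarrow\quad
d_2 = \frac{5 \times 9}{3}\ \text{ft} = 15\ \text{ft}
```

An additive strategy, by contrast, would add the 2-foot change in picture width to the original 9 feet, so the choice of response separates proportional from additive reasoning.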



/ 




VERSION V

12th



VERSION VI 
8th 



All 


1100 


1110 


1111


All 
0 


1100 


1110 
0 


1111
0 


All 
1 


1100 
2 


1110 
0 


1111 J" 

n\ 

0 


9 


5 


0 


0 


0 




0 


0 


9 


2 


3 


0 


23 


44 


14 


0 


3 


1 ! 


0 


0 


IS 


18 


6 


4 


34 


2B 


71 


67 


?0 


A 


83 


64 


34 


16 


61 


65 


27 


22 


14 


33 


25 




10 


36 


30 


52 


25 


26 


4 


0 


0 


0 


3 


-a;— 


7 


0 


11 


11 


-ij 





CHANGES 
None 



l!F, 

VERSION IV

ill IICO UIO illl* All 1100 lUl \ ^ » m^lMit (''Ml tmi']^ IM. ri, h ^ U U\ vtll It hivo (tivtled 



No response

A, [('j \h kciiuse G « ii h 

iii 3 or ^ lbs Itcjiuic It is nure 



II, I \m nc ansver 




12th r_bis = .5240   8th r_bis = .4609
Masters T = 10.1080   T = 10.7061



REASON 

The problem, although difficult, does discriminate.



' 13 ■ 4: ] 0 1 I 






4 


4 i 0 ; o1 4 


i 




0 














J Ol 5 


< 


0 


4 


5 


5 


' 1 1 

7I O'j 5 




0 


< 



L lif*n J!H fi'H UlPuSf If It Mill I9\'kt ^ 
^ tUn ^(tl fed hierv'c It It only } iKiit^ilt ^rt / 

D, 300 (hi k?iii» 3 »». « 1 Ht. • i uc. . , ? 



*n 1100 illO nil «f,.,.,.,„a, K,,,.„inv,, 

Q I 0 ho ntpofli* 



Masters T = 11.7351
Numbers in the problem were changed.



0 



1 



0 



I 

f 

3 
0 



T« 13,0^26 



The simpler integers were intended to be more readily
identified as proportional or additive.



iOF, 



ith 



12lh 



Ml 


1100 1110 mil 

1 1 ! 


Ml 

0 


1100 


1110 

11 


nil Ji 

0 




n 




IT 




"a 

J.. 


JL 


J6 


:i 


31 


0 










_J 




0 


0 








.,JL 


_J 


1 


6 


0 


0 


3 




D 


4 



WD«t 7 m\mi htvm 3? 1 1 tip. » i 3/< li|t T 

'iji in^ np, MHip, ■ ]| Up. <J 
Ts ot 

JL 

/ 

9 



1. Simplification of numbers used.

2. Distractor [illegible] changed to an addition type.



12th r_bis = .5926
KastfiTt T* 12.06^1 



VERSION VI
8th



Ml 1100 m Un .)iv.iM- .11, VlM- l.'.'hi'iO. r.. lilivt.: i..; H 

- ■ ' (M l!," ; II \l|ift r I., n i.Hi: 



1 1 J 


0 


0 


57 




W 


117 


17 


13 


J 


9 




21 


1 


4 


6 


5 


0 


0 


5 


J 


3 


0 



J 



.1: 

/ 



!th 'bil ■ ,S75J HEASffl 

' ' 1, Hi! Itei (KfiH (1 loe HHMU 

2. This is a more appropriate distractor for Level II.



2F 



8F, 



VERSION IV 
8th 



12th



VERSION VI
8th



All 


1100 


1110 


1111 


All 
0 


1100 


1110 
0 


1111

o' 


All
1 


1100 
0 


1110 
0 


nil s 

PC 

k 

0 


37 


35 


57 


33 


62 




79 


57 


39 


27 


67 


83 


5 


0 


7 


0 


3 


H 


3 


0 


12 


8 


3 


0 


14 


11 


7 


0 


0 


"a— 


0 


0 


11 


ID 


1 


0 


26 


59 


29 


67 


30 


s 


17 


36 


18 


21 


13 


13 


16 


11 


0 


■ 0 


4 




0 


, ? 


18 


34 


15 


4 



12th r_bis = .0750
Masters T = 1.2352



8th r_bis = .5280
T • 12J176 



t* in (Hi kum it lODll thpt Kiy 

C, 22 CO, Uum 19 O < 22 A 

D, ]P cn, but tbr ^tioics koiild be Inr^tr / 

E. I have no answer




The item seems sound. Wish to
have a larger group tested with it.



VERSION IV
8th



12th



VERSION VI 
8th



All IICO 1)10 nil . Ml 1100 1110 nil iimlMkruhiUfMDirpiwfi. fi'ticWl h ) p^li ^Idiin w J Mcj iiir>, suul. *^^0 1110 llll a InrSirt-v •» cclilr,t iHc pf ji cS)mi mJ.s If ♦ i» i: th^.i 



All 


llCO 


1)10 


mil 


All 


1100 


1110 
0 


1111 llrr; 

tl 1 

hlf.li 

D 




6 


6 




± 




0 


4 . 












s 


_Q 


0 




10 




100 


9:) 


c 


97 


93 ( 


22 


53 


3 


0 


3 


s 


3 


0 


S 


6 


0 


0 






0 





No response



■0 



AImiiI $ bcciuie .1 i i iliMit S ^ 



Aliout 7 liKiusi It hM t» U tort Zt 

il lliMit S 



n« ^)iwt I bctAJ^o U il } Dor« thin 7 3 
h I hivo Its :mi47 



CHANGES



< 



12th r_bis = 0.5132
Masters T = 9.5243
Replace the problem with one that is less abstract.




All 
0 


1100 

0 


1110 
0 


nil A 

Q 


3 


2 


0 


<] 


11 


8 


3 


0 


60 


60 


91 


IQO 


17 


16 


5 


0 


9 


IS 


3 


0 



8th r_bis = .5759
T = 14.4468



tl9 u\;m 



©About 2P btesusc (3 it ;b4-jl N . 
If , 

P« Alifbi 0 H;nu<( ffl it IS Mir ;j 
ti i h>kC r,» ir^MCT. 

REASON 



/ 

0 



The item has some good characteristics but may be
having the student pull together too many things.






VERSION IV 



8th



VERSION V
12th 



VERSION VI 
8th



Ml 


lUO 


nil 


All 


1110 


nil 


All 


lllp 


1111 


On 
111 








0 


0 


0 


0 


1 


0 




29 


0 


29 


41 


II 


16 


19 


4 


9 


0 


0 


6 


U 


0 


12 


6 


\ 


2S 


29 


0 


4 


1 


0 


17 


12 


0 


21 


36 


100 


52 


M 


75 


32 


36 


91 


22 


7 


0 


9 


H 


7 


22 


25 


0 



No response

m 

KiO k»iiso It is riDrn 

c, )r; iiiifflo It Rfc.^ 1? for 100 



p.] 13} becmisc IDO « 
U, 1 tiavo tiQ iiiisut*r 



0 



/iiplc Cart sTrTiip 
JO^ 4DII|; T 




la 



CHANGES   12th r_bis = 0.5095   8th r_bis = .5591
Masters T = 9.7293   T = 13.9015

None



15F



REASON 

The item appeared to be operating appropriately.



VERSION IV

8th



VERSION V 
12th 



VERSION VI 
8th 



All 


1110 


ml 


All 


1110 


nil 


All 


lUO 


nil \ 








0 


0 


0 


0 


1 


0 


18 


17 


30 


74 


66 


86 


37 


39 




13 


3 


5 


I 


3 


0 


16 


13 


0 


25 


20 


0 


22 


28 


14 


16 


25 


0 


38 


50 


65 


3 


3 


0 


19 


16 


9 


5 


10 


0 


0 


0 


0 


12 


4 


0 



A frcmtr Ji'vpt Urns irnd of tlif ilbtBiiff k travp)*, lie i\H\ thai In i niniitca hf 
trtivt-lR ) rill(!i/ ill jO t\\mis 7^ ni]es, If he continues nt !jicrd, \m Iw^ vill it 
tiilic liitit Id Iritvcl IP rillcst 



©Aliniit \\ ttitiults because U j 

. Jlwlii. Jdniiu inAUIfl. 7 

II) Alidiit 15 kliititr.i brrousc ID - f\ ■ l\ i.llti ^ V 



iJiMntiff Tip' 

nnipj uiii 

7>) Riles IQ^iiii 



M 10 1 2li ' )2<i rin. 
Ci AIkiiiI n nltiutM lircnti&p |.x 10 ■ 13 1/3 

P. About H bceaur^r 7'f t 3 ■ Ki^j 
kml iO'M p 14 



0 III 



CHANGES   12th r_bis = 0.5429   8th r_bis = .5148
Masters T = 10.0235   T = 12.3790

None 



E. I have no answer
REASON

The item discriminates well. Is excellent.



I OF, 



VERSION IV 



V 



12th 



VERSION VI
8th



All 


1110 


un 


All 


lUfl 


nil 


Ml 


1110 


nil 


















lloi. 








0 


0 


0 


1 


0 


0 


ir 


50 


0 


36 


69 


7 


24 


26 


9 


12 


0 


0 


3 


Q 


0 


10 


1 


0 


38 


36 


100 


57 


28 


89 


45 


63 


83 


16 


14 


0 


1 


0 


0 


12 


3 


4 


8 


0 


0 


3 


3 


4 


8 


4 


4 



llcrt 1^ n reel pi" fcr \ en;') pf cofojs lirt lo mi Mlhifi 4 f, iDl 
. AilJvilh vtirriri; H. Mipr 
\ T. 



No response

A, ]( tiltlpsiwi l)cwp J 13 ■ 16 

ji. Naro tli;*!) 6 tililri^^i'its K^aurr tlitrc U vorr c^ct^a 



10 



)8 tIlMf5|M0,^^ because 6 r^wli ,18 



4 



I have no answer



c 



None



12th r_bis = .5230   8th r_bis = .5053   REASON
Masters T = 10.0820   T = 12.0709

The item works well; r_bis is appropriate.






VERSION IV 



VERSION V 
12th



VERSION VI 
8th



\\\\ 


1110 nil 

1 


All 
0 


1110 
0 


nil 


All 


ino 
o_ 




li/ 


. 7 


67 


74 


69 


ao 


16 


13 


57 ■ 


l\ 


71 


35 


11 


28 


14 


39 




35 • 


19 


1-1 


0 


0 


0 


0 


23 


36 


4 




0 








0 


14 


\ 


0 


„ 


0 


« — 
0 ; ' 


0 


5 


3 


4 



No response

0) ft, iliick bocmiso 16 " < 
8 W T 



II, ) ft, \W\\ Ucnuiie 1 



C, ) /I, Ihltli \m^i 1 l»s limn I 

D, 1 fl, tlilcV. brraujc it sliould be li r 
1), I lint 110 utiiwcr 



D' 



CHANGES
None 



12th r_bis = 0.3807   8th r_bis = 0.3876
Masters T = 6.7655   T = 8.6678



5F 



VERSION IV VERSION V 
8th   12th



I 
0 



The item seemed to discriminate but have high difficulty.
I wished to see how it would work with the 12th grade masters.

VERSION VI 
8th 



All 


nio 


nn 


All 
0 


nio 

0 


lUl 

0 


All 

2 ^ 


1110 
1 


11)1 Trim 
Trial 

0 ■ 


17 


13 


0 


32 


55 


7 


19 


34 


26 


34 


23 


95 


54 


21 


86 


25 


22 


C 




15 


0 


3 


3 


4 


22 


19 


)3 


IS 


47 


5 


12 


21 


4 


13 


9 


4 


6 


3 


0 


0 


0 


0 


18 


13 


0 



CHANGES



TiIjI 1 • Tk'o tfcir.lilK cn sUc I'flancfi three of the sone VflcMs i-n flJf T 



4 

0/bcml 8 liecansci C 7.!! 



C, AliMit 7 liccaiisc 4 < 1 * S 
•i 

> 7 

p, About '7 Ificsuxc 6 is )ess lliiut 8 



/ 
0 



0 



12th ^i9 ■ ,4005 Bth '^bis » ,4fl66 
Masters T» 7.1822 T'*9»1.V 



The item appeared to be working appropriately.



VERSION IV
8th



VERSION V 
12th 



VERSION VI 
8th 



CHANGES 
None 



All 
12 


1110 
0 


nil 

fl 


All 
3 


1110 
3 


nil 

4 


All 1110 
0 1 0 


nil Aril 
0 


14 


14 


33 


4 


(1 


4 


10 


6. 


0 . 


29 


43 


33 


28 


7 




r — 

20 


16 




25 


21 


0 


35 


52 


29 


24 


28 


30 


19 


21 


33 


1? 


2B 


0 


21 


31 


0 








19 • 


10 


29 


19 


18 


0 



No response

A, Tlic a'.iiditw kill k Iflifcr llim 9" ^liiiP\rr titc rliifi. / 



C. tout 5 ft. (rn (lie Itaji licciuic J 1 4 • 2,7/ ^3 



ir^ — A 

L i\ 



• 

t), Ahwil 3 ft. (m tlio lnrtj» imw the rini; lO" Z .. V y 



12th r_bis = [illegible]   8th r_bis = .4764

Masters T = 4.9718   T = 11.1701



Jsri.ci- 5 « ] ■ 4 3i>J U\, 4 ] ft, ■ 3 ft. 

li, I tiitvo no Dn.^krr 

REASON

Wished to test with a larger sample.



,T 







(F2 - Level IV item)



6F 



12th



8th



9F 



lAK 

!. 


1110 


CI 1 ! 0 


im 

D 








If) 




1) 


ll 11 


5 


.'1 


1 — 






i 

11 0 








37 


30 


1 


.n '! 24 1 vj 


IS 






1 

f, ! 0 




1 


4 




T - g,;206 



iRfil.'T nun 

A, /V-yf It lltcri bc»u;f J fil. ii 3 W( Ibw S jal. in.| U ll J ner» thin 21, J 



VERSION V
2. Ktli 
All 1110 1111



C, i\ lilin b't LK J f * Ji 

U)) 1( litiri I'tcme ? , ^ 1 II 
71 jJifl 



I4F 



12th



C, I h.c no aiiArr 

This item should be considered. It has promise of good
discrimination. It is [illegible] too difficult. [Illegible] the
2[illegible]-5 gal. could be just a 2:1 comparison and the distractors
then simplified.



0 



VERSION VI

8th

All 1110 1111 
2 



0 



(1 



^ 1(1 7(r\h T 



VERSION VI
9th 



3F. 



r_bis = .3732
T = 6.6091



bis « ,4?61 
T = 11.1607



I. l-h*r«ii« i , 1 

T 
T 

C. 1" kfvt'c I \\ IfM lU I 
I V ^ 4 

(I. r* k^iiV' It iliciilJ tfr leu 



ll, I Uyc no ibt">r 

CQ'UIJ.'T 




/ 
<9 



This problem involves inverse as the square variation.
It is difficult and probably of another level.



12th



8th



|iAa iiii,,.ui iiij I. 

i I 




.h I' ■ i i '! f'-' >U ,Mjf. |.,)f Ml. till; i,. Ip !Ml (il.i-Jr. 

:.nM' )'.•■ 1 I'.ii i<r I,. I . |,.ir „; ui.^ mrl' in 11'. >iil M^it-t, 

k/ l.jt Ik Ins) tl.'iiKr tfiibWii;; u fiir tTini lolU iliii It) i\[t M Inis 

III'. itf„ri r) 



C. .^inlrc u'.r odi I loit Idn,. rrj fmi', 
II, Jl=lc(.;l^v J Ur-im Uim J If, pigrc lliili I 

y ? V 



z 

3 
0 



Ml 

0 


lllC 
0 


llllj 

.J 


'^'1 


1110 


nil w;l 




•If; 






2.1 


.15 < 




u 




JL 




^ 


25 


2d 


IS 


20 


35 


26 


20 


17 


», 


31 


3S 


22 


0 


0 


'0 


'9 


4 


9 



ri ii'i i!i'.<;. l.lii(l;»t Uir ]>lvl<^<.'.. tl,ll,11l,lV) |Mti( ui ! liici; "tUf 



0»o rrvJ"" 



^bls« ,7097 .3936 



C, I or lirfiiii'i lit s'^''^ ^' t1\wiiK 
'III 

II, II Wu\'i l iii'.fh i.vl. kt.ii.l 

n, I li]\< III' wur 



/ 

0 



T - 16.S$IS . 



8.B366 



] * I.3S1'} 



12F 



VUiii;; V 



r_bis = .5395
T = 15.9089



VERSION VI

8th



Masters do not react appropriately to this item. The subtlety
between distractor D and B (the key) is probably too fine.



This needs some editing. Possibly distractors
"D" and "B" should be changed. The item has some
possibilities. For "C" the I or III should be on
the same line.



rn 



2R 



I . it 111 



M Wit} 



lllliAll iiio 



9 0 



IW.U Ml Py I'l ll.i. .Ml •j'ril.r ILyfi'.'. Ill' I'l'li-'i'- l''f J l""'^' 'I"'"''- I '"'tt- 



VERSION V
12th



VERSION VI



— ... . ...'^L„ .!:.^ — 1,-,^ |.l. tf, k.J * in, , ' ' |i ' 

.. I (V) AbU kt-Oiti ir M.'Ia - 1,1 irt. /. 















u 


IT- 


11 


19 


24 


17 


12 


7 


i<; 




, 9 


17 


J« 


1 


4 


r' — ' 
:o 


IS 


9 



'^'bij. Mil 
T = 11.5104



ilirn 

1' 

ll, M>i«il S tii't.'iiiM' f. |i"4>L'. - ? i>niih • \ Hi«U 
tuil ( ir>, • ) trt. ' i (0, 

mm 



3 
/ 
6 



Ml 
0 


1110' 
0 


nil 
0 


All 


1110 
1 


nil rtPltliH l-fsliPlUChr. iHHI'l 

Mttili li lir 
(ll prnnks? 






0 
■/■ 


JL 




;■; r0 Wwil JtrMutt 


I41 


ss 


19 


23 


24 






0 


0 


! i3 


7 


4 At»JMt J I'fm'f It h«» 1 




0 


4 


1 >^ 


4 


^ t 1 hit nv cMver 



■ ^ 



M»f»4 / 






This question should be substituted for one of the
ones used in Level IV.



^bli -TM* rbls • .5401 
T '^'A^'if T " i3'2289 



This question should be substituted for one of
the poorer ones used in Level IV.



poorer 






